Volume X APRIL, 1914


BIOMETRIKA 


CONGENITAL ANOMALIES IN A NATIVE 
AFRICAN RACE 


By HUGH STANNUS STANNUS, M.D. Lond. Medical Officer, Nyasaland. 


(1) I HAVE thought it would be of interest to put on record some observations
made by myself in Nyasaland during the past seven years, on the subject which
appears as the title of this paper.


These observations relate to members of a native population of Bantu stock,
belonging to several main tribes, namely, Mananja, Yao, Ngoni and Tumbuka,
with a few references to the Nkonde in the north and the Nguru from the
south-east.


My interest in the subject was aroused by the frequency with which some 
abnormalities were seen and I think the facts I bring forward will go to shew that 
this unusual incidence is real and not only the result of the ease with which 
observations may be made among a partially clothed community. 


Statistics dealing with the subject, to be of value, must treat of large numbers;
such have, however, only been possible in a few instances, to be referred to later.
I speak therefore largely from impressions in appraising the rarity or otherwise of 
any particular condition. It should be remembered in this direction that the 
cases now to be reported have been met with more or less casually, most of them 
while travelling on the path or in some village, few in the course of Native 
Hospital work and none in any Special Department. 


Classification is a matter of some difficulty for many reasons and as the number 
of anomalies to be described is not very large it is perhaps more convenient to 
consider the various conditions according to the anatomical part affected. 


One large section of congenital anomalies, Anomalies of Pigmentation, I have 
already dealt with (Biometrika, Vol. IX. pp. 333–365), and they will not be
touched on in the present paper. 




No. 1 




(2) Dealing with those deviations from the normal in which there is a change 
of a more or less general nature, I refer firstly to Infantilism, at the same time 
recognising that such a condition may not constitute a truly congenital anomaly. 


To the class designated Idiopathetic Infantilism I should relegate a woman
aged 22 years seen in 1911 at Zomba who presented the figure and development
of a girl of 13. There was no breast development, no pubic or axillary hair and
the rounded contours of the body and limbs usually associated with this age in a
woman were wanting; menstruation had not commenced. In other respects she 
appeared normal and her mental development was but little if at all below the 
average. 


(3) In W. Nyasa I encountered a very excellent example of the Ateliotic Dwarf, 
a perfect “little man,” a man in miniature 1.25 metres in height. Another case
which I think must be considered as one of simple dwarfism is here reproduced:
Samuti, aged 35, a Yao, 1.42 metres high. He is shewn together with a man of
1.85 metres. Samuti shews no other abnormality (Plate I, (1)).


No case of Cretinism or Myxoedematous Dwarfism has been seen. I may here 
mention that Cachectic Infantilism is well seen in some cases of spinal caries
among Natives just as among Europeans. 


A paper on “Congenital Humeral Micromely” in the Nouvelle Iconographie 
de la Salpêtrière, T. XXIV. pp. 463–471, Paris 1911, by Dr S. A. Kinnier Wilson
and myself, contains references to two cases of Achondroplasia in Nyasaland. 
Since then I have heard of two other cases and seen a fifth :—Etimu, male, aged 
25 years, a Yao, son of Masinjiri of Ndindi’s near Chipoli, Dedza District. The 
subject stated that he had no children and that no member of the family was known 
to have been similarly affected. He is a perfect example of the condition as the 
photographs will attest, and further remarks are unnecessary (Plate I, (3) and (4)). 


The following measurements were made and tracings of his hands are here
depicted (Fig. 1):


 (1) Head: maximum length                                        201 cm.
 (2)       breadth                                               158
 (3)       circumference                                         600
 (4) Nose: length, base to root                                   36
 (5)       breadth, across nostrils                               45
 (6) Face: bizygomatic breadth                                   140
 (7)       length, nasion to chin
 (8)       nasion to commissure of lips
 (9) Standing height                                            1182
(10) Span of arms                                               1133
(11) Arm: acromion to external condyle of humerus                 20
(12) Forearm: external humeral condyle to tip of ulnar styloid
(13) Forearm to tip of middle finger
(14) Leg: top of iliac crest to head of fibula
(15)      top of iliac crest to external malleolus                48
(17) Trunk: upper border of sternum to umbilicus
(18)        upper border of sternum to symphysis pubis


Left. Fig. 1. Etimu. Right. 


(4) No case of actual Gigantism has been seen. Tallness or shortness often 
runs in families. The tallest man I have ever seen measured 1.92 metres. He
was the father of an albinotic child and had internal strabismus but no signs of
acromegaly (see Plate I, (2)). Another man whom I have not seen but who was
measured by Dr Davey at Kota Kota was 2.0 metres in height. No case of
Acromegaly has been seen by myself. 

(5) The following case in the want of development of the lower jaw and 
zygomatic arches might be considered as the converse to acromegaly (Fig. 2). 

From the sketch the subject will at once be recognised as a type of Congenital 
Idiot, the above-mentioned features and ill-formed pinnae together with the rather 
bird-like appearance being characteristic. 



Jaidi, male, aged 20 years, a Yao of Chumbosa, Bursali, is the second child of 
a family of three, the elder brother being dead and the younger sister normal. 
No family history was elicited. 


Fig. 2. Jaidi. 


The growth of the face is defective as before noted, the zygomatic arches are 
so little developed that there are practically no cheeks. The descending rami of 
the jaws converge very considerably so that the floor of the mouth is very narrow 
and the horizontal rami are so short that the symphysis is situated mid-way
between the lower lip and the neck as they lie on one horizontal plane. The palate
is high and narrow. 


The following measurements were made: 


Maximum occipito-frontal                                 191 cm.
Bizygomatic at junction of zygoma with temporal
Face: nasion to commissure of lips


Right external strabismus is present and vision defective. 


Though mentally an imbecile with an impaired speech he is an excellent field 
labourer. He states that no woman would marry him but that he has had sexual 
intercourse and that he is capable of the act. 




A few other cases of Congenital Idiocy have been seen and include an example 
of Spastic Diplegia, a Mongol Idiot aged 4 years in W. Nyasa district and two 
microcephalic idiots met with in adjacent villages in Chikala district, in neither of 
which were factors of etiological interest elicited. 


(a) Aged 22, male, looked like a boy of 12 in physical development, the head 
was very small but no measurements were made; the palpebral fissures were 
markedly slanting downwards and inwards and an internal strabismus was present ; 
the ears and palate were normal ; the hands large and like those of a man. 


(b) A male infant aged one year with so marked a degree of microcephaly as 
to approach in type anencephaly, the resemblance being the more marked as the 
protuberant eyes and lips were like those characteristically found in anencephalic 
monsters (Plate II, (7)).


(6) The following case is given at length (Plate II, (5) and (6)).


Masimosya, aged 19 years (1911), a Yao of Chipi’s village Zomba, exhibits a 
marked want of development of sexual organs (male) associated with large breasts. 
The general form of the body is that of a woman; the attitude, voice, laugh,
facial aspect and expression resemble those of a woman rather than of a man. 
The teeth are good, the body and limbs well developed and there is a fair deposit 
of subcutaneous fat. The breasts (see photo) are remarkable, being large, with 
large well-formed nipples and well-marked areolae, dark in colour. They have 
started to become pendulous and resemble exactly those of a nulliparous woman
of the same age. The abdomen is well formed and round the umbilicus there 
is a deposit of fat such as is commonly seen in women; the pelvis appears large. 
There is some hair in the axillae but none on the face or body. The pubes is 
rather prominent resembling the female mons veneris and there is some development
of hair upon it. The penis is very small, only two inches in length and of
infantile type, the glans is covered by a prepuce and there is no deformity. The 
scrotum is very small indeed and only contains one testicle, the left, which can be 
felt as a small body about the size of a bean, three-eighths of an inch long. The 
right testicle is not apparently present in the scrotum or inguinal canal. The 
scrotum shews no tendency to be divided nor is there anything in the arrangement 
of the skin to suggest labia. No rectal examination was made. 


The subject is insane. He is fairly tractable and good-natured. He has 
delusions and hallucinations, it is reported, with various phases of the moon, when he 
is said to travel 15 miles to bathe in a certain stream, etc. He has tried to burn
down some houses. I could get very little of his history. The mother and father 
are said to have been normal; the only other child, a girl, was insane and died in 
the Central Asylum. The subject once cohabited with a woman who was to have been 
his wife, but she ran away the next day and I was unable to find out from him if 
he had any sexual desire. Such is a case which would have been called one of 
Partial Hermaphroditism but in the absence of further data I shall not discuss it. 




(7) Obesity. No cases of general obesity outside normal limits with possibly 
a congenital origin have been seen. Steatopygy does not occur. 

(8) Symmetrical Lipomatosis is conveniently considered here though perhaps 
not strictly within the subject. Three old women have been seen all presenting 
the same abnormal feature, namely, the presence of symmetrical lipomata in both 
axillae, each about the size of a small orange. In a fourth case the affection was 
one-sided, the subject giving a history of the gradual descent of the tumour from 
the upper aspect of the shoulder into the arm-pit. 

That these tumours were lipomata I can only support by clinical examination, 
they certainly were not of the nature of the pads seen in myxoedema and no
signs of that disease were present. There is the possibility that they were accessory
breasts but they did not present the characters found in undoubted cases of
this condition. These tumours may have a similar pathogeny to the masses seen 
on either side of the back of the neck of men and specially described by 
Sir Jonathan Hutchinson; on account of their possible paleogenetic significance 
I have included notes on these cases here. 


(9) Lymphatism. Post-mortem examination on a boy 10 years of age who died 
after receiving a blow on the head revealed a thymus gland of considerable bulk, 
4 inches long. The blow had not severed the soft tissues over the skull and in 
the absence of any other evidence of injury or disease one might suspect the case 
to be one of lymphatism, an inherent disorder which had predisposed to death. 
In a second case, that of a woman aged 40 years who died after moderately severe 
burns, a body 44 inches long of yellow colour and firm consistency was found lying 
on the anterior surface of the heart, the apex of this body being at a level with 
the 2nd costal cartilage. 


(10) Coming now to Malformations, there is a well-defined deformation of the 
skull of which I have seen several examples, the main points of which are well 
shewn in the photographs. The extreme height of the cranium and marked 
dolichocephaly without bossing of the forehead, while the sides of the vault of
the skull are flattened, are characteristic. The photographs depict a boy aged 7, 
son of Matikwiri, headman of Mlanje, whose two younger sisters are said to 
resemble him exactly in the deformity present (Plate IV, (12) and (13)). 


The second case is a boy aged 15 years, the head measured 21.5 cm. long and
12.5 cm. broad (Plate III, (9)–(11)).
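
The degree of dolichocephaly in this second case can be illustrated from the two measurements just quoted. The sketch below computes the cephalic index (maximum breadth as a percentage of maximum length); the limit of 75 used to mark dolichocephaly is the conventional anthropometric figure and is an assumption added here, not a value given in the paper, and the function and constant names are illustrative only.

```python
# Conventional upper limit for dolichocephaly (assumption, not stated in the paper).
DOLICHOCEPHALIC_LIMIT = 75.0


def cephalic_index(breadth_cm: float, length_cm: float) -> float:
    """Maximum head breadth expressed as a percentage of maximum head length."""
    return 100.0 * breadth_cm / length_cm


# Measurements quoted in the text for the boy aged 15 years.
index = cephalic_index(breadth_cm=12.5, length_cm=21.5)
is_dolichocephalic = index < DOLICHOCEPHALIC_LIMIT

print(f"Cephalic index: {index:.1f}")
print(f"Below the conventional limit of {DOLICHOCEPHALIC_LIMIT:.0f}: {is_dolichocephalic}")
```

On these figures the index falls far below the assumed limit, which agrees with the description of the deformity as marked.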

(11) Congenital Ptosis is not uncommon and is associated with the typical 
expression due to this disability. A slight degree of Epicanthus may be fairly
often observed; more marked, it is sometimes seen associated with obliquity 
of the palpebral fissures giving a regular mongolian character to the face 
(Fig. 3). 

Buphthalmos has been seen on two occasions in young adults with a history of 
its congenital nature but nothing else of note; tension normal and vision apparently
good.




Microphthalmos was once seen associated with coloboma of the iris and choroid 
(see below). 


Coloboma. This defect was met with in two brothers aged about 18 and 17 
years, but neither parent nor, as far as I could ascertain, any other member of the
family was similarly affected. Bwanali the elder presented a coloboma of the iris 
and choroid of the left eye ; there was also a small opacity on the posterior surface 
of the lens, which however could not be traced more deeply but which suggested a 
remnant of an “ arteria centralis.” 


Fig. 3. Epicanthus. 


The right cornea shewed some superficial opacities, the iris appeared normal, 
but examination of the fundus revealed a large white triangular area with the 
apex near the disc with here and there small masses of pigment. The middle 
portion of the white area was on a much deeper plane than the rest of the fundus, 
forming a posterior staphyloma, the whole composing a kind of posterior coloboma
(Fig. 4).


Right eye. Left eye. 
Fig. 4. Bwanali Coloboma. 


This boy also had an accessory nipple. 


The younger brother Pete presented on the right side a microphthalmic eye 
with coloboma of iris and choroid resembling the condition in his brother, with an
opaque spot on the posterior surface of the lens. The eye is convergent and
vision poor; he counts fingers at one yard. The left eye is normal. 


Dermoid Cysts of the Face have been seen in the situations shewn in the sketch
(Fig. 5). One of these was excised and found to contain the usual pultaceous
mass mixed with hairs. These hairs examined microscopically were found to be
spindle shaped, tapering at each end; brown diffuse and granular pigment was
present in them.


Fig. 5. Dermoid cysts of face.


A relic of the cleft between the median and upper external processes of the 
foetal face was on one occasion seen as a small pit at the lower extremity of and 
just external to an epicanthal fold. 

(12) Congenital Naevus. Only two cases of naevus have been seen. One, a
woman, presented a small naevus just to the left of the middle line on the forehead
at the margin of the hairy scalp, 1 cm. in diameter. The second was a man with a 
similar growth 1 cm. in diameter on the lower lip just to the right of the middle 
line (Ching’waya of Zomba). 


(13) Ear. The general conformation of the ear varies a good deal; some of the
types are shewn in the sketches (Fig. 6) but all these must be considered as coming
within the limits of normal variation. In one case a kind of Accessory Lobule was
noted; the subject was an albino. A number of persons with Accessory Auricles
have been seen. These consist of little subcutaneous nodules of cartilage forming
tubercles one to four in number situated just in front of the tragus, the affection
being usually bilateral.


Fig. 6.


An abnormality seen affecting a woman in N. Nyasa consisted in the direct 
prolongation of the skin from the side of the head on to the outer surface of the 
pinna so that the upper margin of the ear was hidden, though easily felt beneath 
the skin. 


Helical fistula. Under this name have been described the remains of the first 
branchial cleft found as little pits on the helix. The condition is certainly rare 
in England and persons exhibiting the anomaly are sometimes shewn as interesting 
cases at medical societies. That heredity plays a part in its incidence is well 
known as illustrated by a case shewn by Dr Prichard at the Royal Society of 
Medicine, an infant with symmetrical helical fistulae, whose mother, four siblings, 
maternal grandmother and two great-aunts all exhibited the same defect. 
Having noted this same anomaly in quite a number of natives I became interested 
to ascertain the actual incidence. The statistics given below embody the results 
of my observations covering nearly 6500 individuals of all tribes. The populations
of whole villages were taken so that consecutive unselected persons were dealt
with.


Tribe                     Number    Right     Left     Both

N. Angoni    Males           416        6        3
             Females         612       13        8        5
Atonga       Males          1941       34       22       12
             Females        2576       69       53
Wankonde*                    455        4        8        5
Awemba                        48        1        1
Anyanja                        6        1        1
Ahenga                       142        3        5

Totals                      6491      132      110       50


Thus among 6491 individuals of all ages and both sexes a total of 292 were
found to have helical fistula (4.5 %). It was more commonly unilateral, affecting
the right side a little more often than the left, giving percentages of 2.03 and 1.69
respectively, and for bilateral cases 0.77 %. Taking each sex we see that the
proportions between the three numbers are almost the same.


                 Right     Left     Both
2457 males          41       32       16
3324 females        82       64       28


* These figures were kindly supplied by Dr Davey. 


The actual incidence in the two sexes is however greater among females than
males in the proportion of 5.2 % to 3.6 %. An abnormality occurring so frequently
as 45 per mille might almost be considered to be a variation within the
limits of the normal. The fact remains, however, that it is the persistence of
a foetal character and abnormal, if the whole of mankind be taken into consideration.
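
The percentages quoted for helical fistula follow directly from the printed totals, and the arithmetic may be retraced with a short sketch. The counts are those given in the text and tables; the variable and function names are illustrative only, and it is assumed that the incidence in each sex is the sum of the right, left and bilateral cases divided by the number of that sex examined.

```python
# Totals as printed for all tribes together.
TOTAL_EXAMINED = 6491
CASES = {"right": 132, "left": 110, "both": 50}

# Sexed subsets as printed: (number examined, right, left, both).
BY_SEX = {
    "males": (2457, 41, 32, 16),
    "females": (3324, 82, 64, 28),
}


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


total_cases = sum(CASES.values())
overall_pct = percent(total_cases, TOTAL_EXAMINED)
side_pct = {side: percent(n, TOTAL_EXAMINED) for side, n in CASES.items()}

# Assumption: incidence per sex = all affected of that sex / number examined.
sex_pct = {
    sex: percent(right + left + both, examined)
    for sex, (examined, right, left, both) in BY_SEX.items()
}

print(f"all cases: {total_cases} of {TOTAL_EXAMINED} = {overall_pct:.2f} %")
for side, pct in side_pct.items():
    print(f"{side:>7}: {pct:.2f} %")
for sex, pct in sex_pct.items():
    print(f"{sex:>7}: {pct:.2f} %")
```

The sexed subsets account for 5781 of the 6491 individuals, so the figures for the two sexes are not expected to add up to the totals for all tribes.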

Dealing more in detail with this defect, there is some variation in the exact
site of the fistula; the sketches (Fig. 7) serve to illustrate the extremes of position
in three directions.


Fig. 7. 

Three cases presented two pits on the same side, one each in positions A 
and B. In these three cases the affection was bilateral and symmetrical. The 
common position at which the pit is found is in D. In another case not included 
in the series a pit was observed resembling those above mentioned but situated at 
the junction of the tragus and lobule as in E.

These helical fistulae, which I have described as little pits, consist of a small 
opening on the skin 1 or 2 mm. in diameter leading into a blind sac 1 or 2 mm. 
deep; often this sac opens out into a little ampulla which can be seen and felt 
under the skin. The ampulla and canal are generally filled with a little plug of 
sebaceous matter. 

In three cases the skin in this situation looked like scar tissue and presented 
a honey-combed appearance, there being several openings into the ampulla giving 
the impression that an abscess had formed at some past date in the ampulla with 
consequent loss of tissue. 

The fistula is so common and so unremarkable that most tribes have no name 
for it and one cannot elicit long pedigrees to shew its incidence in families. Cases 
of heredity were common enough but the type was not necessarily the same in 
members of the same family; thus a mother with Left Fistula had a child with 
Right and Left, or again, three brothers were seen two with the Left side affected, 
the third with Right Fistula. 

No malformations in connection with other branchial clefts have been seen.




(14) Lips, Mouth and Palate. Most natives shew a well-marked tubercle in
the median line on the “red” margin of the upper lip; in a few however this is 
replaced by a distinct groove which involves the red margin of the lip or only the 
subjacent fold of mucous membrane (see sketch Fig. 8 and photo, Plate V, (16)). 


Fig. 8. 


These cases resemble one of a Hindu (recorded in the Lancet, Oct. 2, 1909, by 
Thurston), who besides having the median hare-lip was the subject of polydactylism.
In one of my cases there was a considerable gap between the upper
central incisors but no further abnormalities were present. 


In a single case notching of the upper lip was found to the left of the middle
line with a mark running up to the nostril which looked like a scar. There was
no question of any operation having been performed, though the condition
resembled exactly an artificial repair of a lateral hare-lip (Fig. 9). A similar
case has been shewn at the W. Lond. Med. Chir. Society in which there was,
besides, a deformity of the nose and a family history of hare-lip. I have only seen
one case of ordinary Hare-Lip, a Blantyre boy aged 10 years (1909), the affection
being left-sided and unassociated with any cleft of the palate (Plate V, (17)).
Among 30,000 natives examined in the northern districts of this country no
case was seen.


Fig. 9.


No case of typical Cleft Palate has come to my notice; on the other hand I 
have seen three cases which owing to their non-association with defects in the 
upper lip are of great interest. All three cases, one a boy aged 10 years (1906), 
the other two adult males, presented complete Absence of the Premaxilla and
attached teeth. In the boy there was also a Median Perforation in the hard 
Palate. Congenital perforations of the palate apart from clefts are apparently 
rare in Europe. Dundas Grant (Roy. Soc. Med. April 1910) has recorded the case 
of a girl aged 16 years with a perforation above and to the right of the base of 



the uvula with no history of trauma or syphilis. Prof. Karl Pearson has drawn 
my attention to a skull which was brought by Du Chaillu from Fernand Vas in 
the Congo (see Biometrika, Vol. VIII, Plate XXVI); this shews congenital
absence of the premaxilla, but the two maxillae have not approximated in the 
mid-line in front as in my own cases, and we do not know the condition of the 
soft parts, but it is interesting to see this anomaly from another part of Africa. 


(15) Teeth. Native children are said to be born sometimes with teeth; it is 
possible that this is not very rare as there is a common superstition regarding 
them. I have seen one case with this history, to be mentioned later, as having 
deformities of the lower extremities. A gap of as much as } of an inch between 
the lower central incisors has been noticed a number of times, the other teeth 
all being regular and touching one another. A similar condition may be seen also 
affecting the upper pair of incisors, one that I am not conversant with among 
Europeans. Among 1500 natives examined for statistical purposes in regard to 
caries the following numerical abnormalities were noted : 


(a) Complete reduplication of the set of teeth in an adult, the second set 
lying on the palatal side of what appeared to be the normal set. I have every 
reason to believe that this was a case of true reduplication, that is to say, the 
result of growth from doubled enamel organs and not of retention of the deciduous 
teeth. 


(b) Reduplication of upper incisors.
(c) Reduplication of right lower bicuspid. 


(d) Reduplication of both bicuspids in the lower jaw on each side and in the 
upper jaw on the right side in a woman aged 24 years. 


A single case of a Bifid Extremity to the Tongue was seen in an albino 
child. 


(16) Polymazia and Polythelia. 14 cases of these anomalies have been met 
with casually, so that I imagine this anomaly by excess is comparatively not 
uncommon. Short notes of these cases are given below for purposes of comparison:


(a) Male adult, accessory nipple springing from the skin at the right sternal 
edge opposite the 3rd intercostal space, it was large and well formed like a 
woman’s but there was nothing resembling an accessory mamma beneath it. 


(b) Male aged 45. Insane and suffering from spinal caries. There was a 
rudimentary accessory nipple in Scarpa’s triangle on the right side 14” below 
Poupart’s ligament. 


(c) Adult female, an accessory nipple on the right breast, small but well
formed and lying above the one proper to the breast; both are patent and milk 
can be drawn through both. 




(d) Adult male, the accessory nipple is situated in a line with the left nipple 
below it and half-way between it and the costal margin. 


(e) and (f) Two women each had two nipples to the right breast.


(g) A young woman was found to have two nipples on the left breast 
(Plate V, (18)). 


(h) Male with congenital coloboma iridis mentioned above has an accessory 
nipple just above and to the inner side of the right nipple. 


(i) Young male adult has just at the outer edge of the areola of the left
breast a very small accessory nipple, and beyond this and above it over the third 
intercostal space another flat nipple with areola and hairs. 


(j) Male, presents a rudimentary nipple in the left groin just below the 
middle of Poupart’s ligament. 


(k) Young adult male shews a small accessory nipple just below and internal 
to the right nipple; his brother, father, and grandfather are all possessed of the 
same identical anomaly. The subject has no children, no nephews or nieces. 


(l) Female in hospital with syphilis has a small accessory nipple springing
from the skin of the chest wall just internal to the point of the left pendant
breast.


(m) A woman with well-formed accessory breast in the right axilla. It is 
breast-shaped and pendant though there is no nipple. The woman volunteered 
the fact that it was a breast and said it swelled with pregnancy. The right breast
was twice as big as the left. 


(n) An old woman with symmetrical masses in each axilla resembling rather 
the symmetrical lipomata mentioned elsewhere: see p. 6. She states that they 
appeared at puberty and thinks them to be breasts but denies that they enlarged 
with pregnancy. 

In the Japanese this condition has been shewn to be not unrare, and among 
them tuberculosis has been found to be more frequent than among the normal 
population. I can only support the idea with one case (No. b).


(17) Meningocoele and Spina Bifida. No typical case has been noted. A man 
was seen with a little dimple of the skin over the lower part of the sacrum in the
median line having a little fold of skin on either side forming two small vertical 
lips. 

(18) Penis, Testicle ; Hernia. 

Epispadias, hypospadias and extroversion of the bladder have never been seen. 


I have seen a boy aged 18 years with a short penis enclosed in a fold of skin 
from the upper surface of the scrotum (Fig. 10). The boy had other deformities 
which are described later. When examining a number of recruits I was surprised 
to find in a large proportion the right testicle hanging lower than the left, the 




reverse of what is known to occur among Europeans. On examination of 400
consecutive men, adults, between the ages of 30 and 40 years, I found in 166 or
41.5 % the right testicle lower than the left. In the remainder or 58.5 % the
right testicle was on a level with the left, or rather higher in the scrotum. I also
got the impression that, associated with right lower testicles, the testicles and
penis were large. In another series of 280 men, the left was lower than or on the
same level as the right in 185; the right lower in 88. There were two cases of
left cryptorchidism, one of right cryptorchidism; one each left and right hydrocoeles
and two right inguinal bubonocoeles.


Fig. 10. Boy aged 18.
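
The figures for the two series of men can be checked in a few lines. The sketch below uses only the counts given in the text; it assumes that, in the second series, the cases of cryptorchidism, hydrocoele and bubonocoele were tallied separately from the 185 and 88, an assumption suggested by the fact that all the groups together make up exactly 280. The names used are illustrative only.

```python
# First series: 400 consecutive adult men.
FIRST_EXAMINED = 400
FIRST_RIGHT_LOWER = 166

# Second series: 280 men, counts as given in the text.
SECOND_EXAMINED = 280
SECOND_SERIES = {
    "left lower or on same level": 185,
    "right lower": 88,
    "left cryptorchidism": 2,
    "right cryptorchidism": 1,
    "left hydrocoele": 1,
    "right hydrocoele": 1,
    "right inguinal bubonocoele": 2,
}

first_right_pct = 100.0 * FIRST_RIGHT_LOWER / FIRST_EXAMINED
first_remainder_pct = 100.0 - first_right_pct

# Assumption: the listed groups do not overlap, so they should sum to 280.
second_accounted = sum(SECOND_SERIES.values())
second_right_pct = 100.0 * SECOND_SERIES["right lower"] / SECOND_EXAMINED

print(f"first series, right lower: {first_right_pct:.1f} %")
print(f"first series, remainder:   {first_remainder_pct:.1f} %")
print(f"second series accounted for: {second_accounted} of {SECOND_EXAMINED}")
print(f"second series, right lower: {second_right_pct:.1f} %")
```

On this reading the right testicle was the lower in rather less than a third of the second series, against 41.5 % of the first.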


I have come across a number of cases of undescended testis among other 
natives, in some associated with a swelling in the inguinal canal, in others there 
was complete cryptorchidism. Inguinal hernia is not infrequent in adult males 
but I can give no figures relating to a large number of persons. In a single man 
it was associated with umbilical hernia. I have never seen a femoral hernia. 
Umbilical hernia is common enough especially in children. The following figures 

though small in number give some idea of the frequent incidence of the condition.
They refer to all the children in a single village and may therefore be said to be 
unselected in any way. 


Age               −        +       ++

0–1 year         18       13        6
1–2 years        44       27       12
                102       44       10

Totals          164       84       28     = 276
                          40 %
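
The proportion of children affected at each age can be worked out from the table. The sketch below makes three assumptions that should be kept in mind: that the first column counts children without hernia and the other two columns count two grades of hernia; that the first entry of the second row is 44, as implied by the column total of 164; and that the third row, whose age label is not legible, refers to older children. The names used are illustrative only.

```python
# (age group, no hernia, hernia grade 1, hernia grade 2); column meanings assumed.
ROWS = [
    ("0-1 year", 18, 13, 6),
    ("1-2 years", 44, 27, 12),   # 44 inferred from the column total 164
    ("third group (label missing)", 102, 44, 10),
]

column_totals = tuple(sum(row[i] for row in ROWS) for i in (1, 2, 3))
grand_total = sum(column_totals)
overall_pct = 100.0 * (column_totals[1] + column_totals[2]) / grand_total

# Percentage of children with hernia of either grade in each age group.
by_age_pct = {
    age: 100.0 * (grade1 + grade2) / (none + grade1 + grade2)
    for age, none, grade1, grade2 in ROWS
}

print(f"column totals: {column_totals}, grand total: {grand_total}")
print(f"with hernia overall: {overall_pct:.1f} %")
for age, pct in by_age_pct.items():
    print(f"{age}: {pct:.1f} %")
```

Under these assumptions the column totals agree with the printed 164, 84 and 28, the overall proportion comes to about 40 %, and the proportion affected falls from one age group to the next.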


That the hernia diminishes in size even to disappearance after childhood, as
indicated by these figures, is certainly true as the same incidence is undoubtedly 
not found among adults. The protrusion is sometimes very marked and takes the 




shape of the finger of a glove, some several inches long and curving downwards 
(Plate IV, (14)). Writing recently E. M. Corner in doubting the commonness of 
congenital sacs in hernia in general, as insisted on by some writers, has shewn in a 
series extending to between two and three thousand observations that herniae in 
children are often multiple and associated particularly with a ventral hernia, a 
diastema, which though very rare at birth is common in young children and of the 
nature of a true hernia. He believes that this ventral protrusion, which is certainly 
not congenital, is caused by increased abdominal pressure due to gaseous distension
of the bowels the result of fermentative processes, and that other herniae are due 
to the same cause. Among native children abdominal distension is almost the 
rule, “pot-bellied” is an expression always used in speaking of them. This distension
is due largely, I believe, to fermentative processes, and also a second factor,
absent in European children, namely enlarged spleen. Of 50 children under the
age of 5 years taken from among those with umbilical hernia, 43 or 86 % were
found to have the ventral protrusion as described by Corner. 18 of these had
enlarged spleens and 20 shewed a considerable abdominal distension. In none
was any other hernia found. In these cases we see par excellence the effect of
intra-abdominal pressure, in producing first ventral hernia and secondly umbilical
hernia. The weakness of the umbilical scar is due, I have little doubt, to the
method of treating the cord at birth. The custom prevailing among many is to 
bind the whole cord and placenta on to the child’s abdomen till it separates ; 
with others the greater part of the cord is so treated after severing the placenta; 
in any case there must be considerable tension, I think, at the umbilicus and 
sepsis is more likely to occur. Cursham Corner has said that the size of the 
bulging is proportional to the length of cord left proximal to the ligature, and 
the same principle adapted to natives who use no ligature may be true, and 
thus account for the very “long” umbilical hernias. 


I am therefore inclined to agree with Corner that the umbilical and the ventral 
herniae of children are due largely to intra-abdominal pressure, but though my 
numbers are small, the absence of any other hernia among my cases must be 
taken to mean that for their production there is another factor to be taken into 
account, and that is, I believe, in Corner’s cases some congenital structural 
anomaly, namely a congenital sac, and, conversely, I think congenital sacs are 
uncommon among natives of this country. 


(19) Malformations of the Extremities. Various forms of Congenital Talipes 
are met with which call for no special comment.


A peculiar condition characterised by symmetrical shortening of the humeri 
has been observed and forms the subject of a paper by Dr S. A. Kinnier Wilson 
and myself referred to above; certain deformities of the hands and feet are also 
therein dealt with. 


Since this paper was written I have seen three other cases of Congenital 
Humeral Micromely, one of which I mention here as there is a family history of
the defect, a point of some interest and one which I had not elicited in previous
cases. 


Gobedi, male, aged 22 years, a Yao employed as a machila carrier in Zomba, 
exhibits the deformity in typical form well represented in the photograph 
(Plate V, (19)). 

The head of each humerus appears to be poorly developed and though move- 
ment at the shoulder joint is free, a certain amount of fine crepitus is elicited, 
such as was found in several of the other cases. 


The point of interest however is the fact that the maternal aunt is stated to 
have had the same congenital anomaly. 


The subject has no brothers or sisters and his own two young children are 
stated to be normal; his mother and father and more remote relations are not
known to be affected. 


Besides these the following cases deserve mention. 


A boy was seen, 18 years of age, with a peculiar deformation of the hands, 
stated to be congenital; the fingers and thumb shewed considerable thickening 
about the 1st interphalangeal joints with marked ulnar deflection; the bridge of
the nose was depressed, the lips very thick, and epicanthus present. There was 
also the penile deformity above mentioned, except for which I should have 
doubted the statement in regard to the congenital nature of the hand deformity
(Fig. 11).


Fig. 11. Boy aged 18.




I saw at Bandawe a female infant aged 1½ years presenting multiple deformities.
The astragalus of the left foot was apparently implanted in a cup-shaped
depression on the lower end of a very much shortened thigh. The femur
of this leg was short but around it there was an abnormal amount of muscle as 
if the usual amount of muscle for a normal had been cramped up into the 
shortened limb; the foot could be freely moved by the child. The left foot had only 
a hallux and two toes with a partial cleft between the hallux and the adjacent 
toe, but I think four metatarsal bones. The right thigh was also somewhat 
shortened but the bones of the leg apparently both present, the knee-joint could 
not be distinctly made out and was flail. Right talipes equinovarus present, also 
right internal strabismus. No history of similar deformity in family, a brother a 
year older was born with two upper incisors. Father and mother normal. The 
father has two other wives with six and ten children respectively, all normal. 
Such gross congenital deformities are from time to time recorded in Europe, thus 
Lockhart Mummery described a case of congenital absence of the femur in a male
child, etc., in the Brit. Med. Jour. for November 5, 1910.


In a male 35 years of age I found Congenital Absence of the Right Fibula, the 
tibia being bowed forward with 8 inches shortening of the limb, the foot on the 
same side had only three metatarsal bones and three digits including the hallux. 
A woman was seen with congenital shortening of one leg to the extent of four 
inches. 


A single case of unilateral Congenital Dislocation of the Hip has been met 
with. 


(20) Split Hand and Split Foot Deformities. The photograph, Plate IV, (15), 
serves to shew moderately well the deformities met with in a male child aged 
5 years (1905): in the absence of a skiagram it is impossible to go into the detail 
of the bony conditions present. There was no admitted history of similar or 
other deformity in the family. 


A second case, Ndala of Njalusi’s Mangoche, shewed a similar deformity of the 
left hand but in a less degree; he was otherwise normal and stated that no other 
members of his family were similarly affected (Plate VI, (20)). 


These cases are interesting to compare with those collected and classified by 
Lewis and Embleton in Biometrika, Vol. VI, 1908.


(21) Shortening of the Fourth Metatarsal Bone. When first I entered the 
country my attention was attracted by a number of natives who presented a 
shortening of the fourth toe. 


Since then Captain Hughes has noticed the condition in Egypt. The description
he gives is as follows (Lancet, July 16, 1910): “The fourth toe is markedly
retracted usually behind the level of the fifth toe. The phalanges are not appa- 
rently abnormally short, and the metatarsal bone can be felt unfractured but with 
the head very much farther back than usual. Commonly the digit is pushed
upwards by the pressure inwards of the fifth toe. The condition is sometimes
unilateral sometimes bilateral.” He adds that in one case the second metatarsal 
and, in another, the third metatarsal were also shortened. In a single case he 
saw a similar condition in the hand, shortening of the second and fifth meta- 
carpals. 

The above description corresponds exactly with the condition seen in this 
country. I have also seen other toes than the fourth affected, and I shew a photo- 
graph of a man’s feet with involvement of the metatarsal of the hallux; in 
another case the fifth was affected; in another case, a woman, the common variety 
was associated with shortening of the third metatarsal of the left foot (see Plate VI, 
(22), (24) and Fig. 12). 


Fig. 12. Fig. 13. 


(22) Syndactyly of various degrees has been observed; sketches of two 
examples are given in Fig. 13. 

(23) Polydactyly is not at all uncommon. I have casually come across some 
dozen cases in five years. 

In the majority the supernumerary digit consists of a miniature phalanx 
attached to the skin of the hand or foot at the level of the head of the fifth meta- 
carpal or -tarsal bone. Such digits are often removed in childhood, leaving 
a small cartilaginous nodule at the seat of removal. Most commonly it is a 
symmetrical affection of both hands and feet; in other cases hands or feet alone 
(Plate VI, (23)), or one extremity only, present the deformity. In some the 
accessory digit is well formed and an accessory metatarsal or metacarpal bone 
more or less complete is present. In one case it was the hallux which was 
reduplicated, the two digits being partially fused. In another, reported to me, 
the supernumerary digit in each hand was situated on the radial side of the
first finger with probably an accessory metacarpal bone in connection with it.
The feet had extra digits beyond the fifth toes.

(24) The following case is of some interest: 

Chibisa, male, an Angoni of Kawenga’s, aged 30 years. The deformities involve
all segments of the right upper and lower limbs and to a minor extent the
left limbs (Plate VI, (21) and Figs. 15–18). On the right side there is shortening
of the humerus and forearm (10 cm. difference between the two sides), but 
elongation of all the segments of the middle finger and its metacarpal bone; the
middle finger itself measures 10½ cm. The metacarpal bones and phalanges of
the other fingers are, I think, absolutely a little shortened. The left arm and
hand are normal, except that this hand as also the right hand shew a little
nodule at the base of the little finger where a supernumerary digit was removed.


Fig. 15. Fig. 16.
Fig. 17. Fig. 18.


The right foot presents a similar condition to the right hand, elongation of 
phalanges and metatarsal affecting the second toe, the toe itself being 7 cm. 
long. The tibia is somewhat bowed outwards. The left foot presents shortening 
of the metatarsal bone of the hallux. 


The photograph and sketches illustrate some of these points. Other measure- 
ments were as follows: 

Height 166.5 cm.; span of arms 164.5 cm.;

Maximum fronto-occipital 18.0; maximum biparietal 13.8;

Nose length and width 4.4.


(25) Congenital Anomalies of the Kidney. Post-mortem examination on a
native prisoner who died of pellagra revealed the presence of a double kidney
on the left side and none on the right.


From the sketches (Fig. 19) it will be seen that the upper part was the one 
proper to the side while the lower half was the abnormal portion. 


Fig. 19. Churinigu. Kidney of left side double.


The two parts were really very distinct, partly separated by a groove and 
cleft. 


The lower viscus had been felt during life as a tumour in the abdomen of 
unknown nature as it lay along the left side of the vertebral column. The kidney 
was unfortunately removed before dissection of vessels, etc. was made, but the 
sketch shews the arrangement of these at the hilum of the kidney. 


The two ureters united below the lower pole of the double organ, the distal 
ureter being nearly twice the normal size. 




The bladder was normal; there was no right ureter. The suprarenal body 
of the right side was in its normal position and appeared normal. No other 
abnormalities were remarked. 


(26) Some suggestive observations have been made by Dr Ewald Stier,
published in the Deutsche Zeitschrift für Nervenheilkunde (Band XLIV, Heft 1-2,
S. 21), from which the generalisation is made that in all anomalies of overgrowth the
right side of the body is much more frequently involved than the left, whereas in
anomalies of undergrowth the left is more commonly the site of the condition
than the right. This distribution is taken to be the result of a preponderance of
persons with a leading or superior left cerebral hemisphere, since with left-handed
persons the converse was found to be true. In other words, in right-handed people,
in whom the left hemisphere is the superior hemisphere, the plus anomalies occur
on the right side and the minus anomalies on the left side; in left-handed people
the converse is true.


I have therefore tabulated my observations, and though small in number they
tend to confirm the idea, assuming that the African native is right-handed. This
remains unproved, and a less marked superiority of the left hemisphere may
account for non-conformity of my few cases to Stier’s rule.


                                  Right   Bilateral   Left
Plus Anomalies:
  Polymazia                         1         1         0
  Polythelia                       10         0         5
Minus Anomalies:
  Hare-lip                          0         0         2
  [label illegible]                 2         0         2
  Absence of Fibula                 1         0         0
  Absence of Fibula and Tibia       0         0         1
  Split hand, foot                  0         1         1
  Shortened metacarpal, tarsal      0         1         3
  Syndactyly                        0         0     [illegible]
  Coloboma iris                     1         1         0
Plus and Minus together:
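
As a rough illustration of how the tabulation bears on Stier's rule, the rows can be totalled by side. The Python sketch below is not part of the original paper: it uses only those rows of the table whose label and three counts are all legible, so its totals are not the author's, and the dictionary names and layout are assumptions made for the example.

```python
# Counts are (right, bilateral, left), copied from the legible rows of the table.
# Rows with a lost label or a missing count are deliberately left out.
PLUS = {
    "Polymazia": (1, 1, 0),
    "Polythelia": (10, 0, 5),
}
MINUS = {
    "Hare-lip": (0, 0, 2),
    "Absence of fibula": (1, 0, 0),
    "Absence of fibula and tibia": (0, 0, 1),
    "Split hand, foot": (0, 1, 1),
    "Shortened metacarpal, tarsal": (0, 1, 3),
    "Coloboma iris": (1, 1, 0),
}


def side_totals(rows):
    """Sum the right, bilateral and left counts over a group of anomalies."""
    right = sum(counts[0] for counts in rows.values())
    bilateral = sum(counts[1] for counts in rows.values())
    left = sum(counts[2] for counts in rows.values())
    return right, bilateral, left


plus_totals = side_totals(PLUS)
minus_totals = side_totals(MINUS)
print("plus  (right, bilateral, left):", plus_totals)
print("minus (right, bilateral, left):", minus_totals)
```

On these rows alone the plus anomalies lean to the right and the minus anomalies to the left, but the numbers are far too small to bear any weight.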


In considering these cases it should be remembered that the majority of my 
observations have been made casually among natives met in the bush, in villages, 
etc., others in the course of routine work among troops, prisoners, etc.; a few
were the result of special investigation.


(27) Concluding Remarks. The notes of cases which I have thus collected 
together form rather a medley of facts but I think certain deductions may be 
made from them. 




It would appear that 


(1) The slighter the anomaly the greater the frequency with which it may 
be observed. 


(2) The more marked degrees of deformity are only seen in children and 
those in places where European influence is felt. 


(3) Cases of heredity are only seen among the lesser anomalies.


(4) The least obvious congenital anomaly is a helical fistula, and this is 
found in 4.6% of the population and is frequently inherited.


The difference in the observed incidence between the minor anomalies and 
those of more marked proportions may be real or only apparent. I think the 
latter supposition is true for reasons which can be deduced from the facts given 
above. 


It is the custom among all the tribes of this country to destroy all deformed 
children at birth. Any minimal deformity such as a helical fistula is of course 
unrecognised, an accessory nipple is probably hardly noticeable, accessory digits 
which can be removed by a nick with a knife are matters of no import, while a 
foot with six well-formed toes would hardly be considered worthy of note. These 
abnormalities are therefore comparatively common, but hare-lip, cleft-palate, 
deformities common enough in Europe, are among the rarest in this country; a 
child with a hare-lip would be seen to resemble a hare and would be immediately 
destroyed. Children with the greater deformities would certainly be destroyed. 
In recent years under European influence native customs fall into abeyance and so 
we see my single case of hare-lip in a boy aged 10 at Blantyre, a township of 25 
years standing, a child with gross deformities of the lower extremities born
practically on a mission station; or, to quote another example, an albino reported by
myself was the fifth albino child born, the first four having been killed at birth
by order of a chief, who in later years came under the influence of an up-country 
mission station, for which the living albino has to thank his survival. The gross 
abnormality of absence of premaxilla would pass unnoticed as the deformity is 
slight. History relates that in the case of the child with lobster claw deformity of 
hands and feet, it was only saved from a summary death by the efforts of the 
mother. 


I think with the evidence as it stands one may with fairness say that con- 
genital anomalies are common among the natives of this country. Secondly, I 
think one may also deduce from the facts stated that abnormalities of all kinds 
are at least not uncommon. In the few cases in which I have adduced statistics 
there can be no doubt, in other cases it is rather a matter of one’s impression. 


I have shewn that certain congenital anomalies among natives of Nyasaland 
are common and have attempted to argue that probably many of them are 
common. 




(28) Very few statistics are available for comparison, but I should like to refer 
to some by writers in Egypt. Prof. Madden cites in a letter to the Lancet a case 
of cleft-palate which he operated on as the first in 11 years during surgical work 
at the Kasr-el-ainy Hospital, and assigns as the cause of the lack of such cases 
the “truly awful struggle for existence” which would eliminate infants so handi- 
capped. The Lancet remarked (Lancet, July 3, 1909), in an annotation upon this 
letter, that Prof. Elliot Smith considers it to be impossible to endeavour to explain 
this rarity of congenital defects in Egypt, unless the time-honoured scapegoat of 
our too modern civilisation be invoked to account for their frequency in other 
countries. Statistics of the Kasr-el-ainy Hospital compiled by Dr Day are quoted
in 1907; among 2630 total surgical admissions the only congenital deformities 
were 5 hare-lips, 2 talipes, 2 imperforate anus, 1 extroversion of bladder; in 1908, 
2702 admissions, 3 hare-lips, 2 imperforate anus, 1 hypospadias, 1 undescended 
testicle, 1 meningo-encephalocoele. Capt. G. W. G. Hughes, R.A.M.C., in a paper 
to the Lancet, July 16, 1910, referring to this annotation, remarks “Readers will
be interested to hear that our too modern civilisation is innocent of this slur,” 
and goes on to shew that many congenital defects are by no means uncommon. 
Dealing with males between the ages of 14 and 21 years he gives the following 
figures : 

Hare-lip in 0.041%.

Cleft-palate 0.016%.

Polydactylism 0.058 and 0.04%, in two series.

Shortened metatarsal 0.37% and 0.23%.

Other deformities of fingers and toes 0.22%.

Talipes 0.16%.

Among the thousands of ancient Egyptian bodies which Prof. G. Elliot Smith 
has unearthed and examined, a single case of cleft-palate was met with, a female 
of 20 years of age with a skull of negroid type, of between the 4th and 6th 
century B.C.; only one case of talipes (T. equinovarus) was recorded.


It is obvious that in Egypt surgical treatment is not sought in cases of
cleft-palate and rarely for other congenital defects, but many of them are common
enough.


May the rarity of defects among the ancient peoples of Egypt be due to the
same cause that acts in Nyasaland to-day? Were the children affected with 
deformities killed at birth and “thrown onto the dust-heap” where their remains 
were soon lost trace of? Of chief interest to me are the figures published by 
Captain Hughes. He shews that a shortening of the 4th metatarsal bone occurs 
in percentages rising to 0.37 of males examined. This defect is peculiarly common
in this country. Again, polydactylism occurs in 0.05% and other deformities of
fingers and toes in 0.22% of Egyptians, both deformities very frequently met
with by myself in Nyasaland. 




He however does not mention polythelia and polymazia nor helical fistula. 
I should be very interested to learn if this last insignificant anomaly was looked 
for. There is no doubt that one of them, helical fistula, occurs with a frequency 
in Nyasaland which cannot be rivalled by any other among peoples of any race. 
I think one may also say with certainty that the incidence of others (shortened 
4th metatarsal and polydactylism) in this country is far in excess of that among 
Europeans, though probably much about the same as in Egypt. 


Upon what hypothesis can these facts be explained? Is there a single cause 
or are there many at work? These are questions which I shall not attempt to 
enter into, but by simply recording my observations I shall hope to stimulate 
others to do the same, for only by accumulating facts can it be hoped that such 
problems will ever be solved. 


Biometrika, Vol. X, Part I. Plate I

Samuti, an Ateliotic Dwarf. Subgiant, Height 1.92 metres, with Wife and
Albinotic Child.

Etimu, aged 25, an Achondroplasic Dwarf.

(1) (2) (3) (4)

Biometrika, Vol. X, Part I. Plate II

Masimosya, aged 19, Gynaecomastos, with other features which were formerly described as those of
Partial Hermaphroditism.

[Case of Hydrocele testis included by an oversight of Dr Stannus in the photographs, and
engraved in consequence. Discovered too late to rearrange plates.]

Microcephalic Infant.

(5) (6) (7) (8)

Biometrika, Vol. X, Part I. Plate III

Boy, aged 15 years, showing Scaphocephaly.

(9) (10) (11)

Biometrika, Vol. X, Part I. Plate IV

(12) Son of Matikwiri, aged 7, a case of Scaphocephaly.

(13), (14) Cases of Umbilical Hernia.

(15) Split Hand and Foot in a child, aged 5.

Biometrika, Vol. X, Part I. Plate V

(16) Case showing faint medium depression of upper lip.

(17) Blantyre boy, aged 10, with Hare-lip.

(18) Young woman with two nipples on left breast.

(19) Gobedi, aged 22, with congenital Humeral Micromely.

Biometrika, Vol. X, Part I. Plate VI

(20) Ndala, Split Hand, left only.

(21) Chibisa, aged 30, elongation of all segments of middle finger, and its metacarpal bone.

(22) Shortening of the fourth metatarsal bone.

(23) Case of Polydactyly.

(24) Shortening of the left great toe.


TABLES OF POISSON’S EXPONENTIAL 
BINOMIAL LIMIT. 


By H. E. SOPER, M.A. 


In his treatise, Recherches sur la Probabilité des Jugements, Paris, 1837,
Poisson* shows that the series of frequencies

$$\frac{n!}{r!\,(n-r)!}\, p^{n-r} q^{r}$$

given by the expanded terms of the binomial

$$(p+q)^{n},$$

becomes in the limit, when $q$ is diminished, and $n$ increased, indefinitely, but so
that $nq$ remains finite and equal to $m$, the exponential series

$$e^{-m}\left(1 + m + \frac{m^{2}}{2!} + \ldots + \frac{m^{r}}{r!} + \ldots\right),$$

and he points out that the terms of this series will give the proportional
frequencies of the occurrences

$$0,\ 1,\ 2,\ \ldots,\ r,\ \ldots$$

times, in any sample, of an event, every occurrence of which is equally likely in
the sample and independent of the other occurrences, and which is of such
frequency that $m$ events occur in the sample on an average.
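
The passage to the limit can be checked numerically. The following Python sketch is an illustration added here, not part of the original paper; the function names and the choice of m = 2 are assumptions. It holds nq = m fixed, lets n grow, and reports the largest gap between the binomial terms and the corresponding terms of the exponential series.

```python
import math


def binomial_term(n, q, r):
    """Term of (p + q)^n for r occurrences, with p = 1 - q."""
    return math.comb(n, r) * (1.0 - q) ** (n - r) * q ** r


def poisson_term(m, r):
    """Term e^(-m) m^r / r! of the exponential series."""
    return math.exp(-m) * m ** r / math.factorial(r)


def max_gap(n, m, r_max=10):
    """Largest absolute difference between the two series for r = 0..r_max."""
    q = m / n  # q is diminished as n is increased, so that nq stays equal to m
    return max(
        abs(binomial_term(n, q, r) - poisson_term(m, r))
        for r in range(r_max + 1)
    )


for n in (10, 100, 1000, 10000):
    print(n, f"{max_gap(n, 2.0):.6f}")
```

The printed gap shrinks as n increases, which is the behaviour the limit describes.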


The series is arrived at by “Student†,” when considering the theoretical
frequencies in sample drops of a liquid of minute corpuscles supposed distributed 
at random throughout the mass of the liquid. 


The event may also occur in time, each occurrence being supposed to take 
place with equal probability in any finite period taken as the sample, and to act 
independently of the occurrences of all the other events. A physical example, 
which appears by the closeness of the observed to the theoretical frequencies to 


* pp. 205 et seq. 
† Biometrika, Vol. V. p. 351, “On the Error of counting with a Haemacytometer.”




satisfy these conditions, is the number of α-particles discharged per 1/8-minute or
1/4-minute interval from a film of polonium*.


In vital statistics the sample may be an individual or house or community and 
the event an accident or disease and so on. But it must be borne in mind that 
for such series as the above to be applicable the occurrence of one event in the 
sample must not preclude or influence in any way the occurrence of a second. 


The probability of $x$ occurrences, $m$ being the mean number, in a sample, is

$$e^{-m}\, m^{x} / x!$$

and in the tables which follow this is evaluated for $m = 0.1, 0.2, \ldots$ to $15.0$ and for
$x = 0, 1, 2, \ldots$ up to such an integer as gives a figure in the sixth place of decimals,
the number of places tabulated.
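
A column of the table can be regenerated directly from the general term. The sketch below is an added illustration in Python, not the author's procedure; the function names are assumptions. It evaluates the term for a given m and stops at the first x beyond the mean whose value no longer shows a figure in the sixth place.

```python
import math


def poisson_term(m, x):
    """General term e^(-m) m^x / x!."""
    return math.exp(-m) * m ** x / math.factorial(x)


def column(m, places=6):
    """Tabulated values for one m, for x = 0, 1, 2, ... while a figure remains."""
    terms = []
    x = 0
    while True:
        value = round(poisson_term(m, x), places)
        if value == 0 and x > m:
            break  # nothing left in the last tabulated place
        terms.append(value)
        x += 1
    return terms


for x, value in enumerate(column(1.0)):
    print(x, f"{value:.6f}")
```

Calling `column(0.1)` in the same way gives the five entries of the first column of the table.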


The terms of the series were calculated, each by a fractional operation upon
the preceding, beginning with the modal term and going both forward and back.
Thus if $m = 7.6$ the term $e^{-7.6} \times (7.6)^{7}/7!$ was first calculated by tables of logarithms,
and the succeeding terms were then obtained seriatim by the operations

$$\times \frac{7.6}{8}, \quad \times \frac{7.6}{9}, \quad \text{etc.}$$

and the preceding ones by the operations

$$\times \frac{7}{7.6}, \quad \times \frac{6}{7.6}, \quad \times \frac{5}{7.6}, \quad \text{etc.}$$

done with a mechanical calculator, first a multiplication and then a division.
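
The scheme just described, starting from the modal term and working outwards by successive ratios, can be sketched in Python as follows. This is an added illustration under stated assumptions: the function name is hypothetical, the logarithm of the modal term is taken with `math.lgamma` in place of printed tables, and ordinary floating-point arithmetic stands in for the mechanical calculator.

```python
import math


def poisson_column_from_mode(m, x_max):
    """Terms for x = 0..x_max, built outwards from the modal term."""
    mode = int(m)  # the modal term, e.g. x = 7 when m = 7.6
    # Modal term via logarithms: log(e^(-m) m^mode / mode!)
    log_modal = -m + mode * math.log(m) - math.lgamma(mode + 1)
    terms = {mode: math.exp(log_modal)}
    # Succeeding terms: multiply by m, divide by the next x.
    for x in range(mode, x_max):
        terms[x + 1] = terms[x] * m / (x + 1)
    # Preceding terms: multiply by x, divide by m.
    for x in range(mode, 0, -1):
        terms[x - 1] = terms[x] * x / m
    return [terms[x] for x in range(x_max + 1)]


col = poisson_column_from_mode(7.6, 30)
print(f"modal term   {col[7]:.7f}")
print(f"sum of terms {sum(col):.7f}")
```

Starting from the largest term keeps every later multiplication and division working on a quantity no larger than the one before it, which is presumably why the computation was arranged in this way.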


Seven places of decimals were thus calculated and the series is checked by the 
total, which differs from unity by the remainder (a figure in the eighth or later 
place of decimals in all the present cases) and the algebraical sum of the errors of 
seventh figure approximations. 
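
The check by the total can be illustrated as follows. This Python sketch is an addition, with hypothetical names, and assumes ordinary rounding to seven places: the sum of the rounded terms should differ from unity by no more than the untabulated remainder plus half a unit in the seventh place for each term.

```python
import math


def check_column(m, places=7):
    """Return (discrepancy of the rounded total from 1, bound on that discrepancy)."""
    half_unit = 0.5 * 10 ** -places
    exact = []
    x = 0
    while True:
        term = math.exp(-m) * m ** x / math.factorial(x)
        if x > m and term < half_unit:
            break  # later terms would round to zero at this many places
        exact.append(term)
        x += 1
    rounded = [round(t, places) for t in exact]
    remainder = 1.0 - sum(exact)       # mass of the terms not tabulated
    discrepancy = 1.0 - sum(rounded)   # what the computer sees on totalling
    bound = remainder + len(rounded) * half_unit
    return discrepancy, bound


discrepancy, bound = check_column(7.6)
print(f"discrepancy {discrepancy:.2e}, bound {bound:.2e}")
```

A discrepancy larger than the bound would point to an arithmetical slip somewhere in the column.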


Poisson’s exponential series has been previously calculated to four places of 
decimals by L. von Bortkewitsch† for values of m from 0.1 to 10.0.


The present tables give the probability of each number of times of occurrence 
of the event. For the sums of these values, that is, the probability of occur- 
rence of the event, a given number of times or greater, or a given number of times 
or less, reference must be made to a second paper in this issue of Biometrika‡,
where such probabilities are calculated for integral values of m from 1 to 30. 
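
The relation between the individual terms and the summed probabilities can be sketched in Python as below. This is an added illustration only; the function names are assumptions and the code does not reproduce the tables of the second paper.

```python
import math


def poisson_term(m, x):
    """Probability of exactly x occurrences when the mean number is m."""
    return math.exp(-m) * m ** x / math.factorial(x)


def at_most(m, k):
    """Probability of k occurrences or less."""
    return sum(poisson_term(m, x) for x in range(k + 1))


def at_least(m, k):
    """Probability of k occurrences or greater."""
    if k <= 0:
        return 1.0
    return 1.0 - at_most(m, k - 1)


print(f"m = 1: one or less  {at_most(1.0, 1):.6f}")
print(f"m = 1: two or more  {at_least(1.0, 2):.6f}")
```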


* See Rutherford and Geiger: “The Probability Variations in the Distribution of α-Particles,”
Philosophical Magazine, Vol. XX. p. 700, 1910. See also E. C. Snow, “Note on the Probability
Variations, &c.,” Vol. XXII. p. 198, 1911, who finds the variance of experiment from theory to be such
as would occur once in six experiments and once in three experiments respectively of the limited time
taken, were theory exact. In a note to the first paper H. Bateman gives a proof of the exponential series
of probabilities arrived at from considerations of this problem.

† Das Gesetz der kleinen Zahlen, 1898. A comparison of the table printed therein with the present
table shows agreement except as to the fourth figure; the nearest fourth figure is not given, in rather
many instances, in the tables of Bortkewitsch.

‡ Lucy Whitaker, B.Sc. “On the Poisson Law of Small Numbers,” Vol. X. p. 37 et seq.


TABLE of $e^{-m} m^{x}/x!$: General Term of Poisson's Exponential Expansion
(“Law of Small Numbers”).

x \ m      0.1      0.2      0.3      0.4      0.5      0.6      0.7      0.8      0.9      1.0
    0  .904837  .818731  .740818  .670320  .606531  .548812  .496585  .449329  .406570  .367879
    1  .090484  .163746  .222245  .268128  .303265  .329287  .347610  .359463  .365913  .367879
    2  .004524  .016375  .033337  .053626  .075816  .098786  .121663  .143785  .164661  .183940
    3  .000151  .001092  .003334  .007150  .012636  .019757  .028388  .038343  .049398  .061313
    4  .000004  .000055  .000250  .000715  .001580  .002964  .004968  .007669  .011115  .015328
    5        -  .000002  .000015  .000057  .000158  .000356  .000696  .001227  .002001  .003066
    6        -        -  .000001  .000004  .000013  .000036  .000081  .000164  .000300  .000511
    7        -        -        -        -  .000001  .000003  .000008  .000019  .000039  .000073
    8        -        -        -        -        -        -  .000001  .000002  .000004  .000009
    9        -        -        -        -        -        -        -        -        -  .000001

x \ m      1.1      1.2      1.3      1.4      1.5      1.6      1.7      1.8      1.9      2.0
    0  .332871  .301194  .272532  .246597  .223130  .201897  .182684  .165299  .149569  .135335
    1  .366158  .361433  .354291  .345236  .334695  .323034  .310562  .297538  .284180  .270671
    2  .201387  .216860  .230289  .241665  .251021  .258428  .263978  .267784  .269971  .270671
    3  .073842  .086744  .099792  .112777  .125510  .137828  .149587  .160671  .170982  .180447
    4  .020307  .026023  .032432  .039472  .047067  .055131  .063575  .072302  .081216  .090224
    5  .004467  .006246  .008432  .011052  .014120  .017642  .021615  .026029  .030862  .036089
    6  .000819  .001249  .001827  .002579  .003530  .004705  .006124  .007809  .009773  .012030
    7  .000129  .000214  .000339  .000516  .000756  .001075  .001487  .002008  .002653  .003437
    8  .000018  .000032  .000055  .000090  .000142  .000215  .000316  .000452  .000630  .000859
    9  .000002  .000004  .000008  .000014  .000024  .000038  .000060  .000090  .000133  .000191
   10        -  .000001  .000001  .000002  .000004  .000006  .000010  .000016  .000025  .000038
   11        -        -        -        -        -  .000001  .000002  .000003  .000004  .000007
   12        -        -        -        -        -        -        -        -  .000001  .000001

x \ m      2.1      2.2      2.3      2.4      2.5      2.6      2.7      2.8      2.9      3.0
    0  .122456  .110803  .100259  .090718  .082085  .074274  .067206  .060810  .055023  .049787
    1  .257159  .243767  .230595  .217723  .205212  .193111  .181455  .170268  .159567  .149361
    2  .270016  .268144  .265185  .261268  .256516  .251045  .244964  .238375  .231373  .224042
    3  .189012  .196639  .203308  .209014  .213763  .217572  .220468  .222484  .223660  .224042
    4  .099231  .108151  .116902  .125409  .133602  .141422  .148816  .155739  .162154  .168031
    5  .041677  .047587  .053775  .060196  .066801  .073539  .080360  .087214  .094049  .100819
    6  .014587  .017448  .020614  .024078  .027834  .031867  .036162  .040700  .045457  .050409
    7  .004376  .005484  .006773  .008255  .009941  .011836  .013948  .016280  .018832  .021604
    8  .001149  .001508  .001947  .002477  .003106  .003847  .004708  .005698  .006827  .008102
    9  .000268  .000369  .000498  .000660  .000863  .001111  .001412  .001773  .002200  .002701
   10  .000056  .000081  .000114  .000158  .000216  .000289  .000381  .000496  .000638  .000810
   11  .000011  .000016  .000024  .000035  .000049  .000068  .000094  .000126  .000168  .000221
   12  .000002  .000003  .000005  .000007  .000010  .000015  .000021  .000029  .000041  .000055
   13        -  .000001  .000001  .000001  .000002  .000003  .000004  .000006  .000009  .000013
   14        -        -        -        -        -  .000001  .000001  .000001  .000002  .000003
   15        -        -        -        -        -        -        -        -        -  .000001


x \ m      3.1      3.2      3.3      3.4      3.5      3.6      3.7      3.8      3.9      4.0
    0  .045049  .040762  .036883  .033373  .030197  .027324  .024724  .022371  .020242  .018316
    1  .139653  .130439  .121714  .113469  .105691  .098365  .091477  .085009  .078943  .073263
    2  .216461  .208702  .200829  .192898  .184959  .177058  .169233  .161517  .153940  .146525
    3  .223677  .222616  .220912  .218617  .215785  .212469  .208720  .204588  .200122  .195367
    4  .173350  .178093  .182252  .185825  .188812  .191222  .193066  .194359  .195119  .195367
    5  .107477  .113979  .120286  .126361  .132169  .137680  .142869  .147713  .152193  .156293
    6  .055530  .060789  .066158  .071604  .077098  .082608  .088102  .093551  .098925  .104196
    7  .024592  .027789  .031189  .034779  .038549  .042485  .046568  .050785  .055115  .059540
    8  .009529  .011116  .012865  .014781  .016865  .019118  .021538  .024123  .026869  .029770
    9  .003282  .003952  .004717  .005584  .006559  .007647  .008854  .010185  .011643  .013231
   10  .001018  .001265  .001557  .001899  .002296  .002753  .003276  .003870  .004541  .005292
   11  .000287  .000368  .000467  .000587  .000730  .000901  .001102  .001337  .001610  .001925
   12  .000074  .000098  .000128  .000166  .000213  .000270  .000340  .000423  .000523  .000642
   13  .000018  .000024  .000033  .000043  .000057  .000075  .000097  .000124  .000157  .000197
   14  .000004  .000006  .000008  .000011  .000014  .000019  .000026  .000034  .000044  .000056
   15  .000001  .000001  .000002  .000002  .000003  .000005  .000006  .000009  .000011  .000015
   16        -        -        -  .000001  .000001  .000001  .000001  .000002  .000003  .000004
   17        -        -        -        -        -        -        -        -  .000001  .000001

x \ m      4.1      4.2      4.3      4.4      4.5      4.6      4.7      4.8      4.9      5.0
    0  .016573  .014996  .013569  .012277  .011109  .010052  .009095  .008230  .007447  .006738
    1  .067948  .062981  .058345  .054020  .049990  .046238  .042748  .039503  .036488  .033690
    2  .139293  .132261  .125441  .118845  .112479  .106348  .100457  .094807  .089396  .084224
    3  .190368  .185165  .179799  .174305  .168718  .163068  .157383  .151691  .146014  .140374
    4  .195127  .194424  .193284  .191736  .189808  .187528  .184925  .182029  .178867  .175467
    5  .160004  .163316  .166224  .168728  .170827  .172525  .173830  .174748  .175290  .175467
    6  .109336  .114321  .119127  .123734  .128120  .132270  .136167  .139798  .143153  .146223
    7  .064040  .068593  .073178  .077775  .082363  .086920  .091426  .095862  .100207  .104445
    8  .032820  .036011  .039333  .042776  .046329  .049979  .053713  .057517  .061377  .065278
    9  .014951  .016805  .018793  .020913  .023165  .025545  .028050  .030676  .033416  .036266
   10  .006130  .007058  .008081  .009202  .010424  .011751  .013184  .014724  .016374  .018133
   11  .002285  .002695  .003159  .003681  .004264  .004914  .005633  .006425  .007294  .008242
   12  .000781  .000943  .001132  .001350  .001599  .001884  .002206  .002570  .002978  .003434
   13  .000246  .000305  .000374  .000457  .000554  .000667  .000798  .000949  .001123  .001321
   14  .000072  .000091  .000115  .000144  .000178  .000219  .000268  .000325  .000393  .000472
   15  .000020  .000026  .000033  .000042  .000053  .000067  .000084  .000104  .000128  .000157
   16  .000005  .000007  .000009  .000012  .000015  .000019  .000025  .000031  .000039  .000049
   17  .000001  .000002  .000002  .000003  .000004  .000005  .000007  .000009  .000011  .000014
   18        -        -  .000001  .000001  .000001  .000001  .000002  .000002  .000003  .000004
   19        -        -        -        -        -        -        -  .000001  .000001  .000001


x \ m      5.1      5.2      5.3      5.4      5.5      5.6      5.7      5.8      5.9      6.0
    0  .006097  .005517  .004992  .004517  .004087  .003698  .003346  .003028  .002739  .002479
    1  .031093  .028686  .026455  .024390  .022477  .020708  .019072  .017560  .016163  .014873
    2  .079288  .074584  .070107  .065852  .061812  .057982  .054355  .050923  .047680  .044618
    3  .134790  .129279  .123856  .118533  .113323  .108234  .103275  .098452  .093771  .089235
    4  .171857  .168063  .164109  .160020  .155819  .151528  .147167  .142755  .138312  .133853
    5  .175294  .174785  .173955  .172821  .171401  .169711  .167770  .165596  .163208  .160623
    6  .149000  .151480  .153660  .155539  .157117  .158397  .159382  .160076  .160488  .160623
    7  .108557  .112528  .116343  .119987  .123449  .126717  .129782  .132635  .135268  .137677
    8  .069205  .073143  .077077  .080991  .084871  .088702  .092470  .096160  .099760  .103258
    9  .039216  .042261  .045390  .048595  .051866  .055192  .058564  .061970  .065398  .068838
   10  .020000  .021976  .024057  .026241  .028526  .030908  .033382  .035943  .038585  .041303
   11  .009273  .010388  .011591  .012882  .014263  .015735  .017298  .018952  .020696  .022529
   12  .003941  .004502  .005119  .005797  .006537  .007343  .008216  .009160  .010175  .011264
   13  .001546  .001801  .002087  .002408  .002766  .003163  .003603  .004087  .004618  .005199
   14  .000563  .000669  .000790  .000929  .001087  .001265  .001467  .001693  .001946  .002228
   15  .000191  .000232  .000279  .000334  .000398  .000472  .000557  .000655  .000766  .000891
   16  .000061  .000075  .000092  .000113  .000137  .000165  .000199  .000237  .000282  .000334
   17  .000018  .000023  .000029  .000036  .000044  .000054  .000067  .000081  .000098  .000118
   18  .000005  .000007  .000008  .000011  .000014  .000017  .000021  .000026  .000032  .000039
   19  .000001  .000002  .000002  .000003  .000004  .000005  .000006  .000008  .000010  .000012
   20        -        -  .000001  .000001  .000001  .000001  .000002  .000002  .000003  .000004
   21        -        -        -        -        -        -        -  .000001  .000001  .000001

0 | 002243 | 002029 | 001836 | 001662 | “001508 | 001360 | -001231 | 001114 | -001008 | “000912 
1 | 013682 | 012582 | -011569 | 010634 | -009772 | 008978 | 008247 | 007574 | ‘006954 | 006383 
2 | 041729 | -039006 | -036441 | -034029 °031760 | -029629 | -027628 | -025751 | “023990 | 022341 | 
3 | 084848 | 080612 | 076527 | 072595 | ‘068814 065183 | "061702 | “058368 | -055178 | 052129 
4 | °129393 | *124948 | 120530 | 116151 | “111822 | +107553 | “103351 | -099225 ‘095182 | 091226 
5 | 157860 | 154936 *151868 | 148674 | *145369 -141969 | -138490 | +134946 | +131351 | *127717 
6 | 160491 | *160100 | +159461 *158585 *156166 | “154648 | +152939 | “151053 | *149003 
7 | 139856 | *141803 | 143515 144992 146234 *147243 “148020 | "148569 | *148895 | 149003 
8 | “106640 | *109897 | 113018 115994 “118815 *121475 | *123967 | "126284 | "128422 | 130377 
9 | 072278 | ‘075707 | 079113 | 082484 085811 089082 | 092286 | -095415 | ‘098457 101405 
10 | 044090 | 046938 | 049841 | -052790 | “055777 0! 58794 | 061832 | -064882 | -067935 | -070983 
11 | 024450 | *026456 028545 -030714 032959 “035276 | “037661 | -040109 | “042614 ‘045171 
12 012429 | -01366Y | ‘014986 ‘016381 °017853 -019402 | ‘021028 | 022728 | ‘024503 | 026350 
13 | 005832 | 006519 -007263 008064 | ‘008926 -009850 | ‘010837 | ‘011889 | -013005 | 014188 
14 | 002541 | “002887 -003268 -003687 | 004144 | ‘004644 | 005186 | 005774 | 006410 007094 
15 | -001033 | 091193 -001373 -001573 | 001796 002043 | 002317 | 002618 | 002949 -003311 
16 | 000394 | 000462 | -000540  -000629 | 000730 000843 | -000970 | 001113 | “001272 001448 
17 | 000141 | 000169  -000200 | -000237 | 000279 -000327 | “000382 | “000445 | 000516 000596 
18 | 000048 | 000058 | -000070 | 000084 000101 -000120 | 000142 | ‘000168 | 000198 000232 
19 | 000015 | ‘000019  -000023 | 000028 | ‘000034 -000042 | 000050 | 000060 | 000072 000085 
20 | 000005 | ‘000006  -000007 | ‘000009 000011 “000014 | 000017 | 000020 | -000025  -000030 
21 | 000001 | -000002 -000002 | -000003 | 000003 -000004 | -000005 | 000007 | -000008 000010 | 
22,5 | 000001 | -000001 000001 000001 | ‘000002 | “000002 | -000003 -000003 
23 | = = | — 000001 -000001 000001 




Poisson’s Exponential Binomial Limit 


TABLE—(continued). 
| 

-| 
x | 7.1 | 7.2 | 7.3 | 7.4 | 7.5 | 7.6 | 7.7 | 7.8 | 7.9 | 8.0 | x

| 0 | 000825 747 | 000676 | -000611 | ‘000553 000500 | 000453 -000410 | 000371 | 000335 | 
| 1 | 005858 | ‘005375 | 004931 | 004523 | ‘004148 -003803 | ‘003487 003196 002929 | 1 
| @ | 920797 | 019352 | °018000 | 016736 | 015555 014453 013424 012464 011569 | 010735 | 2 
| 3 | 049219 | 046444 | 043799 | 041282 | 038889 036614 | ‘034455 032407 030465 | ‘028626) 
4 087364 | 083598 | ‘079934 | 076372 072916 °069567 °066326 063193 060169 | ‘057252 4 | 
& | *124057 | 120382 | 116703 | 113031 “109375 105742 | °102142 -098581 °095067 | ‘091604| | 
| 6 | 146800 | "144458 | 141989 | 139405 | "136718 *133940 °128156 125171 | 122138] 6 
| *148897 , °148586 | 148074 147371 | "146484 °144191 +142802 *141264 | °139587| 7 
8 132146 | 133727 | °135118 136318 °137329 *138150 138783 -139232 "139499 | °139587 8 
9 | 104249 | 106982 | "109596 *112084 | *114440 | 116660 118737 *120668 122449 | 124077} 9 
10 | 074017 | ‘077027 | 080005 082942 | ‘085830 088661 091427 -094121 096735 | (099262 | 10 | 
11 | 047774 050418 | 053094 055797 | 058521 °061257 063999 —-066740 -069473 °072190| 71 | 
12 | 028267 | 030251 | 032299 034408 | 036575 ‘038796 041066 043381 | 045736 | 048127 | 12 | 
13 | 015438 | °018137 ‘019586 | 021101 | ‘022681 | 024324 -026029 027794 | 029616 | 13 | 
14 | 007829 -008616 | 009457 010353 | 011304 012312 013378 014502 015684 | 016924} 14 | 
15 | 003706 004136 004603  -005107 005652 006238 | 006867 007541 008260 ‘009026 15 | 
| 16 | 001644 | “001861 -002100 | -002362 | 002649 002963 -003676 | 004078 004513 | 16 | 
17 | 900687 -000788 | 000902 001028 | ‘001169 °001325 001497 001687 | 001895 | 002124 
18 | 000271  -000315 | -000366 -000423 | -000487 000559 | 000640 000731 000832 000944 18 
19 | 000101 | 000119 -000141  -000165 | 000192 000224 | *000259 -000300 000346 000397 19 
209 | 000036 000043 | *000051 000061 | 000072 | 000085 000100 | -000117 ‘000137 | 000159 20) | 
21 | 000012 | 000021 | 000026 -000031 ‘000037 000043 ‘000051 ‘000061 | | 
| 22 | 000004 | 600005 *000006 | 000007 | “000009 00011 | 000013 000015 | 000018 °000022 | 22 | 
| 32 | 900001 000002 | 000002 | -000002 | 000003 -000004 | 000004 —-000005  *000006 ‘000008 23 | 
24 — 000001 | 000001 | 000001 000001 “000001 -000002 “000002 000003 | 24 
25 — — | = 000001 | 000001 000001 | 25 

x | 8.1 | 8.2 | 8.3 | 8.4 | 8.5 | 8.6 | 8.7 | 8.8 | 8.9 | 9.0 | x
0 | 000304 | 000275 | ‘000249 | 000225 | 000203 000184 000167 | 000151 | 000136 | ‘000123 
1 | 002459 002252 002063 | ‘001889 | -001729 | 001583 | ‘001449 | 001326 001214 | ‘OO1111, J 
2 | 009958 | -009234 -008560 | °007933 | 007350  -006808 006304 ‘005836 005402 | 004998 2 
| 3 | 026885 | 025239 | 023683 | 022213 | 020826 019517 018283 017120 | ‘016025 | ‘014994 3 
| 4 | 054443 | 051740 049142 | 046648 044255 041961 °039765 | 037664 | (035656 | ‘033737 
| | 088198 | 084854 °081576 ‘078368 | 075233 072174 ‘069192 -066289 | 060727 | 
| 6 | 119067 | 115967 112847 | 109716 | ‘106581 *103449 °100328 -097224 | ‘094143 | ‘091090 | 6 
7 | 1137778 | *135848 | *133805 | *131659 | “129419 127094 °124693  °122224 119696 | 117116) 7 
8 | -139500 | -139244 | -138823 | 138242 | 137508 *136626 *135604 | “134446 | “133161 | 131756 | 8 
| -126866 °128025 *129026 | 129869 °130554 “131084 ‘131459 | 131682 | 131756 | 9 
10 | 101696 | 104031 | “106261 | “108382 | “110388 °112277 “114043 *115684 | *117197 | “118580 | 10 
11 | 074885 | 077550 ‘080179 | 082764 | 085300 -087780 090197 | ‘092547 | ‘094823 | 097020 | 11 
| 12 | 050547 | 052993 | °055457 ‘057935 060421 062909 065393 -067868 | ‘070327 | °072765 | 1? 
| 13 | ‘031495 | ‘033426 | 035407 | 037435 039506 041617 043763 045941 | 048147 | 050376 | 13 
| 14 | 018222 | 019578 | 020991 | 022461 | 023986  -025565 ‘027196 | 028877 | 030608 | 032384 | 14 
| 15 | 009840 | ‘010703 011615 012578 013592 014657 015773 | ‘016941 | “018161 | 019431 | 15 
16 | 004981 005485  -006025 | 006604 007221 007878 ‘008577 | 009318 | 010102 | 010930 | 16 
| 17 002373 002646 002942 | 003263 003610 “004389 | 004823 | 005289 | 005786 | 17 
18 | 001068 | ‘001205 001356 | ‘001523 ‘001705 | ‘001904 | ‘002121 002358 | ‘002615 | 002893 | 18 
000455 | 000520 | 000593 | 000673 000763 | -000862 | 000971 | 001092 | -001225 | 001370 | 19 
000184 | | | 000423 | 000481 | 000545 | 000617 | 20 


| 
a 
| 
| 


82 | 

| 
000071 | 000083 
2 | 000028 000031 
000009 | 000011 
| 000004 
“000001 | -000001 
= | 
| 

| 
| 
000112 | ‘000101 | 


& 
Site 


001016 | 000930 | 
004624 004276 | 
| 014025 | -013113 | 
031906 | 030160 | 
058069 | :055494 


*O88072 | “085091 
*114493 | *111834 
*130236 | *128609 


“131683 | *131467 
-119832 | 120950 
-099133 | *101158 

| 075176 | ‘077555 
052623 | 054885 

034205 | -036067 | 

| 020751 | °022121 | 

‘011802 | 012720 | 

006318 | 006884 | 

-003194 | 003518 | 

001530 | 001704 

000696 | 000784 

(00302 | -000343 | 

000125 | ‘000144 | 

000049 | -000057 | 

000019 | -000022 

000007 | ‘000008 

000002 | 000003 

000001 | -000001 


10°2 


| 
| 
‘000041 
000415 
002095 
007054 


“000037 
“000379 
*001934 
*006574 


TABLE—(continued). 

| 
8.3 | 8.4 | 8.5 | 8.6 | 8.7 | 8.8 | 8.9 | 9.0 | x
000097 | 000113 000131 | 000152 000175 000201 | 000231 | 000264 | 2/ | 
-000037 | 000043  -000051 | 000059 | | 000093 | “000108 | 22 | 
.000013 | 000016 000019 | 000022 -000031 000036 000042 | 23 
-000005 | 000006  -000007 | 000008 -009009 “000011 | -000013 | -000016 | 24 | 
-000002 | -000002 -000002 | 000003 | 000003 -000004 | “000005 | 000006 | 25 
= 000001 | 000001 | 000001 *000001 | 000001 | 000002 | -000002 | 26 | 
| | - | -000001 | -000001 a7 
9.3 | 9.4 | 9.5 | 9.6 | 9.7 | 9.8 | 9.9 | 10.0 | x
“000091 | *000083 wos -000068 | °000061 | 000055 | 000045 | 0 
-000830 | ‘000778 | ‘000711 | 000650 000594 000543 | 000497 000454, 
.003954 | ‘003655 | ‘003378 | ‘003121 | 002883 -002663 ‘002459 "002270, 
‘012256 | 011452 | 010696 | ‘009987 | 009322 008698 | 008114 7567 | 3 | 
028496 | ‘026911 | °025403 | 023969 | 022606 “021311 “020082 | 018917 4 | 
-053002 | °050593 | 048266 | °046020 | ‘043855 °041770 -039763 | 037833 | | 
-082154 | 079262 °076421 | ‘073632 070899 -068224 "065609 | °063055 
“109147 | °106438 | °103714 | *100981 098246 095514 | “092790 0900797 
-126883 | °125065 | °123160 °121178 “119123 | “117004 | *114827 "112599 8 
‘131113 | °130623 | +130003 | -129256 | °128388 | 127405 | °126310 | “125110 9 
*121935 | °122786 .°123502 | *124086 | “124537 | °124857 | °125047 | °125110 | 
-103090 | 104926 | °106661 | °108293 “109819 | 111236 “112542 113736 11 | 
079895 °082192 “084440 | -086634 | 088770 | 090843 °092847 | 094780 12 
‘057136 °059431 061706 | 063976 | 066236 | ‘068481 070707 | 072908 13 
‘037968 039904 °041872 | 043869 | °045892 | °047937 050000 | °052077 14 
“023540 025006 | 026519 | 028076 | 029677 | °031319 033000 | °034718 15 | 
013683 -014691 | -015746 | ‘016846 | ‘017992 “019183 | 020419 | 021699 16 | 
007485 -008123 008799 | 009513 | ‘010266 | 011058 ‘011891 | 012764 17 
003867 004242 | 004644 | 005074 | 005532 | 006021 | 006540 | 007091 18 
001893 002099  *002322 | 002563 | ‘002824 | °003105 | “003408 003732 19 | 
 *000986 *001103 | 001230 -001370 | 001522 *001687 | ‘001866 20 
“000390 °000442 000499 | 000563 000633 000710 *000795 | 000889 21 
000165 000189 000215 | 000245 000279 | -000316 *000358 | 000404 22 
°000077  *000089 | -000102 *000118 | °000135 000154 | 000176 | 23 
(090026 -000030 | ‘000035 | 000041 000048 | 000055 | 000064 | -000073 | 24 
 *000013 | 000016 | *000018 | °000022  -000025 000029 25 
 *000004 *000005 -000006 | -000007 -000008 000010 000011 26 
000001 | 000002 | -000002 | *000002 | ‘000003 “000004 000004 | 27 
— -000001 | -000001 *000001 | -000001 -000001 000001 28 
= = | — 000001 | 29 | 
= \— | | 
10.3 | 10.4 | 10.5 | 10.6 | 10.7 | 10.8 | 10.9 | 11.0 | x

| | 
-000034 -000030 *000028 | 000 25 | °000023 “900020 000018 | ‘0000170 
000346 “000317 *000289 000264 | ‘000241 :000220 | 000201 -000184 J | 
001784 001646 °001518 | 001400 | 001291 “001190 | “00 001097 001010, 2 
006125 | 005705 005813 | 004946 | 004603 “004283 "003705 | 3 




TABLE—(continued). 
| 
m
x | 10.1 | 10.2 | 10.3 | 10.4 | 10.5 | 10.6 | 10.7 | 10.8 | 10.9 | 11.0 | x
4 | 017811 016764 | 015773 | 014834 | ‘013946 | 013107 | 012313 011564 -010856 ‘010189 | 4 
5 | 035979 | 034199 | 032492 | 030855 | ‘029287 | 027786 026350 024978 | 023667 °022415| 5 
6 | 060565 | 058139 | 055777 | 053482 051252 049089 046991 044960 | 042995 °041095, 6 
7 | 087387 | 084716 ‘082072 | ‘079458 ‘076878 °074334 | ‘071830 069367 | 066949 ‘064577 
% | 110326 | -108013 | 105668 | -103296 -100902 -098493 096072 093646 | ‘091218 “088794 | 8 
9 | *123810 | °122415 | 120931 | *119364 117720 *116003 | ‘114219 “112875 | “110475 "108526 | 9 
10 | 125048 | 124863 | °124559 | 124139 *123606 *122963 | "122215 *121365 "120418 °119378 | 10 
11 | 114817 | ‘115782 | °116633 | “117368 “117987 *118492 | *118882 | *119159 | “119323 | 11 
12 | 096637 | 098415 | “100110 | “101719 “108239 | “104667 | *106003 | *107243 | “108386 109430) 12 
3 | 075080 | 077218 | ‘079318 | 081375 °083385 | “085344 | °087248 | “089004 "090877 092595 | 13 
14 | 054165 | 1056259 | 058355 | 060450 062539 | 064618 066683 | 068730 | 070754 072753 | 14 
15 | 036471 | 038256 | ‘040071 | 041912 °043777 045663 | 047567 049485 | ‘051415 053352 | 15 
16 | 023022 | 024388 | *025795 | 027243 028729 | 030252 | 031810 | -033403 | 035026 036680 | 16 
17 | 013678 | 014633 | ‘015629 | 016666 | -017744 | 018863 | ‘020022 021220 022458 028734 | 17 
18 | (007675 | 008292 | ‘008943 | 009629 | 010351 | ‘011108 | ‘011902 | ‘012732  °013600 °014504 18 
19 | 004080 | 004451 | “004848 | 005271 005720 | 006197 | ‘006703 | 007237 | “007802 “008397 | 19 
20 | 002060 | 002270 | 002497 | 002741 003003 | ‘003285 | “003586 | 003908 | 004252 004618 | 20 
21 000991 | 001103 | °001225 | 001357 001502 | -001658 | “001827 | -002010 | “002207 002419 | 21 
22 -000455 | ‘000511 | 000573 | 000642 -000717 | :000799 | ‘000889 000987 | 001093 | 001210) 22 
23 | 000200 “000227 | 000257 | 000290 000327 | 000368 | “000413 “000463 000518 | "000578 | 23 
24 | 000084 | 000096 | 000110 | 000126 000143 | -000163 | ‘000184 “000208 “000235 "000265 | 24 
25 | -000034 | -000039 000045 | 000052 -000060 | -000069 | 000079 000090 000103 000117 | 25 
26 | 000013 “000015 | 000018 | 000021 000024 -000028 | 000032 000037 000043 ‘000049 | 26 
| 000005 | 000006 | -000007 | 000008 | -000009 000011 | 000013 ‘000015 000017 “000020 | 27 
28 | 000002  -000002  -000003 | 000003 | 000004 | «000004 | 000005 000006 | -000007 | “000008 | 28 
29 | 000001 | 000001 | 000001 | 000001 000001 | ‘000002 000002 | ‘000002 | 000003 | 000003 29 
| — | 000001 000001 000001 | 000001 | “000001 
| | | 
} | | | | | | | 
| | | 
0 | 000015 | 000014  -000012 | 000011 | 000010 000009 | °000008 000008 | “000007 | 000006 | 0 
1 | -000168 | “000153 000140 | 000128 | ‘000116 | “000106 | 000097 000089 | 000081 | 000074 | 
2 | 000931 | 000858 000790 | 000727 000670 °000617 000568 -000522 | 000481 | 000442 | 2 
4 | 003445 | 003202 | 002976 | 002764 -002568 002385 ‘002214 002055 | 001907 | 001770 | 3 
4 | 009559 | 008965 | -008406 | 007879 | -007382 “006915 | 006476 | 006062 | 005674 | 005309 4 
5 | 021221 | 020082 | -018997 | 017963 °016979 016043 | 015153 ‘014307 | 013504 | ‘012741, 5 
6 | 039259 | 037487 | 035778 | 034130 | 032544 031017 | 029549 028137 | 026782 | 025481 6 
» | «062253 | 059979 | 057755 | 055584 | 053465 | 051400 | 049388 | 047432 | 045530 | 0436827 | 
8 | 086376 | 083970 | 081579 | 079206 076856 | °074529 | 072231 069962 | -067725 | 0655238 | 
9 | 106531 | °104496 | 102427 | *100328 098204 -096060 | ‘093900 | 091728 | 089548 | "087364 | 9 | 
10 | 118249 | +117036 | °115743 | °114374  -112935 | *111430 | -109863 | “108239 | *106562 *104837 | 10 
11 | 119324 | -119164 “118899 | -118533  -118068 *117508 | *116854 | “116110 | *115281 | “114368 17 
12 | 110375 | 111220 | 111964 | -112607 | “113149 °113591 | 113933 *114175 | 114320 | "114863 | 12 
3 | 094243 | 095820  -097322 | 098747 | 100093 “101358 | “102539 | “103636 | 104647 | 105570) 13 | 
14 | 074721 | 076656 | ‘078553 | 080409 | -082219 ‘083982 ‘085694 -087350 | 088950 090489 14 
15 | 055294 | 057236 059177 | 061110 | ‘063035 | 064946 | 066841 | ‘068716 070567 072391 | 15 
16 \ 038360 040065 041793 | ‘043541 | 045306 | ‘047086 | ‘048877 -050678 | 052484 054293 | 16 
17 | 025047 | 026396 | 027780 | ‘029198 | ‘030648 | 032129 | -033639 (035176 | ‘036739 | ‘038325 | 17 | 
| 015446 | 016424 | ‘017440 | 018492 | 019581 | 020706 021865 | ‘023060 | 024288 | 025550 | 18 | 
19 | 009023 009682 | ‘010372 | °011095 011852 | 012641 | ‘013465 | 014322 | -O15212 | 016137 19 




TABLE—(continued). 


x | 11.1 | 11.2 | 11.3 | 11.4 | 11.5 | 11.6 | 11.7 | 11.8 | 11.9 | 12.0 | x


20 | 005008 005422 005860. 006324 -006815 | 007332 | 007877 °008450 | -009051 | 009682 20 
| 21 | (002647 | 002892 | -003153 | 003433 -003732 | 004050 | 004388 004748 | 005129 | -005533 | 
| 22 001336  -001472 001620 -001779 -001951 | 002136 002234 002547 -002774 | 003018 | 


000645 | -000717 000796 | -000882 000975 -001077 001187 -001307 | 001435 | 001575 23 
24 | 000298 -000335 -000375 -000419  -000467 | 000521 000579 000642 -000712 | 000787 74 


000132 | -000150 000169 000191 | 000215 | 000242 -000271 “000303 | 000339 | ‘000878 25 
26 000057 | ‘000065 -000074 -000084 | 000095 000108 -000122  -000138 000155 “000174 26 
27 | 000023 | 000027  -000031 -000035 | 000041 | 000046 -000053  -000060 | 000068  -000078 | 27 
28 | 000009 | 000011 ‘000012 -000014 | -000017 000019 | -000022 000025 | -000029 000033 | 28 
29 | “000004 | 000004 -000005 -000006  -000007 | -000008 | 000009 | 000010 -000012 | ‘000014 29 


30 | 000001 | 000002 | 000002  -000002  -000003 -000003 -000003  -000004 | 000005 -000005 | 30 
31 — 000001 -000001 000001  -000001  -000001 000001 000002 | 000002  -000002 | 41 
— | | 000001 | -000001 | “000001 | 32 

x | 12.1 | 12.2 | 12.3 | 12.4 | 12.5 | 12.6 | 12.7 | 12.8 | 12.9 | 13.0 | x
0 000006  -000005 , -000005 | 000004  -000004 “000003 | 000003 000003 “000002 | 000002} 0 
-000067 -000061 | 000056  -000051 | 000047  -000042 | -000035 000032 | 000029 
2 000407 000374 | “000344 | 000317 000291 000268 | 000246 000226 “000208 | ‘000191 | 
% 001641 -001522 | 001412 -001309 | -001213 | 001124 | 001042 000965 000894 | “000828 | 
004966 | -004643 | -004341 -004057 | -003791 003541 | 003307 | -003088 | 002882 | 002690 | 4 
5 | 012017 011330 | 010679 | -010062 | -009477 | 008924 | -008400 | -007905 | 007436 | 006994 | 
| 6 | 024233 | -023037 | ‘021892 -020794 | -019744 | 018740 | -017781 | 016864 | “015988 | 015153 |. 6 
| 7 | 041889 | 040151 | “038467 036836 | -035258 | 033733 | -032259 | -030837 | -029464 | 028141) 7 
8 | 063358 ‘061230 059142 | -057095 | -055091 | 053129 | 051212 | 049339 | 047511 | “045730 | 8 
9 | -085181 | -083000 | ‘080828 | 078665 | -076515 074381 | ‘072266 | 070171 | -068100 | “066054 | 9 
10 | *103069 | -101261 | -099418 -097544 | 095644 °093720 | ‘091777 | 089819 | “087849 | “085870 | 10 
11 | -113376 -112308 | “111168 "109959 | 108686 *107352 | “105961 | -104516 *103023 | “101483 11 | 
12 | 114321 | -114180 | *113947 -113624 | -113215 | 112720 | -112142 | -111484 | “110749 | “109940 12 
13 | 106406 | 107153 °107811 | :108380 | -108860 *109251 | -109554 | -109769 “109897 | "109940 13 
| 091965 | ‘093376 094720 095994 | -097197 098326 | 099381 *100360 | -101263 | *102087 | 14 
15 | -074185 | -075946 | 077670 -079355 | 080997 | -082594 | 084143 “085641 “087086 | 088475 18 
16 | 056103 | 057909 | “059709 | -061500 | -063279 065043 | 066788 | 068513 | 070213 | “071886 16 
17 | 039932 | -041558  °043201 -044859 | -046529 | 048208 | 049895 | 051586 053279 054972 | 17 


18 | 026843 | ‘028167 | ‘029521 ‘030903 | -032312 °033746 | 035204 | 036683 | -038183 039702 18 

19 | ‘017095 | 018086 | ‘019111  -020168 | 021258 ‘022379 | 023531 | -024713 | 025925 -027164 | 19 

20 | 010342 | 011033 | 011753 | 012504 | 013286 | 014099 | -014942 | -015816 | -016721 | 017657 | 20 
005959 | 006409 | 006884 -007383 | ‘007908 , 008459 | 009036 | 009640 | 010272 010930 | 21 

-003278 | -003554 | 003849  -004162 | 004493 004845 | 005216 | 005609 | 006023 | ‘006459 | 22 

‘001724 | 001885 | 002058 002244 | 002442 | 002654 | 002880 | ‘003122 | 003378 003651 | 2 

24 | 000869 | 000958 | 001055 -001159 | 001272 | 001393 | 001524 | -001665 | ‘001816 | 001977 | 74 

25 \ 000421 | -000468 | 000519 -000575 | -000636 | 000702 | ‘000774 000852 | ‘000937 | 001028 | 25 

26 | -000196 | 000219 | 000246 000274 | 000306 | ‘000340 000378 | 000420 | 000465 | 000514 | ~ 

27 | 000088 | 000099 000112 -000126 | 000142 | ‘000159 | 000178 | 000199 | 000222 | ‘000248 27 


28 \ 000038 | 000043 -000049 -000056 | -000063 | °000071 | 000081 | 000091 “000102 


29 | 000016 -000018 “000021 -000024 | -000027 | 000031 | 000035 | 000040 000046 | “000052 | 29 


30 | 000006 000007 -000009 ‘000010 | -000011 | 000013 | -000015 | 000017 | 000020 | ‘000022 30 
31 | -000002 | -000003  -000003 “000004 | 000005 | -000005 | -000006 | -000007 000008 | -000009 31 
32  -000001 -000001 000002 | -000002 | -000002 | -000002 | -000003 | 000003 | 000004 | 32 
— 000001 | 000001 | 000001 | 000001 | 000001 000001 | -000002 | 33 
— | - | GOL 34 


m
x | 13.1 | 13.2 | 13.3 | 13.4 | 13.5 | 13.6
0 | 000002 | -000002 000002 -000002 | -000001 ‘000001 
1 | 000027 000024 °000022 000020 -000019 000017 
2 | 000175 “000161 | 000148 , 000136 000125 | ‘000115 
3 | 000765 000709 | 000657 -000608 -000562 | ‘000520 
| 002510 002341 | 002183 -002035 -001897 | -001768 
5 | 006575 | 006180 | 005807 005455  -005123 | ‘004810 
6 | 014356 013596 | °012872 | 012183 011526 “010902 
| 026867 ‘025639 | 024458 -023322 | 022230 | 021181 
8 | 043994 | 042304 | 040661 039064 | 037512 | -036007 
| 064036 | 062046 | ‘060088 | 058161 056269 054410 
10 | -083887 | ‘081901 ‘079916 -077936  -075963 073998 
11 | 099901 | 098281 *096626 -094940  -093227 091489 
12 | *109059 | “108109 | “107094 | “106017 “104880 | 103687 
| 109898 °109773 *109566 | 109279 -108914 | “108473 
| 1102833 | *103500 104087 | -104595 *105024 | *105373 
15 | 089807 | °091080 -092291 | -093439 “094522 “095539 
16 073530 | -075141 | 076717 | 078255 “079753 | ‘081208 
17 056661 | 058345 | “060019 | -061683 063333 | 064966 
18 041237 | ‘042786 | 044348 | 045920 | -047500 | 049086 
| 19 -028432 -029725 | 031043 | 032385 | -033750 | -035135 
| 20 | 018623 | -019619 | 020644 | 021698 022781 “023892 
21 | O11617 | 012332 | 013074 | 013846 014645 | 015473 
22 | -006917 | 007399 | ‘007904 | 008433 “008987 | 009565 
23 | 003940 | -004246 | 004571 | 004913 | -005275 | 005656 
24 | 002151 | 002336 | ‘002533 | -002743 002967 | ‘003205 
| 95 | -001127 | 001233 | ‘001348 | 001470 -001602 001744 
26 | 000568 | 000626 | 000689 | 000758 | 000832 | 000912 
27 | 000275 | 000306 | 000340 | 000376 -000416 | ‘000459 
28 | 000129 -000144 | 000161 | 000180 | 000201 | 000223 
29 | 000058  -000066 | 000074 | -000083 | -000093 | 000105 
30 | «000025 | 000029 | “000033 | -000037 | -000042 | -000047 
31 | 000011  -000012 | 000014 | 000016 | 000018 | ‘000021 
| 32 | 000004 | -000005 000006 | 000007 | “000008 | -000009 
| 83 | 000002  -000002 000002 | 000003 “000003 “000004 
| 34 | 000001 | 000001 -000001 | -000001 | 000001 | -000001 
| = {| — | 900001 
x | 14.1 | 14.2 | 14.3 | 14.4 | 14.5 | 14.6
@ | 000001  -000001 000001 000001 000001, — 
| 000011 | -000010 | -000009 | “000008 -000007 :000007 
| 000075  -000069 | ‘000063 | 000058  -000053 | “000049 
3 | 000352  -000325 | 000300 000277 -000256 | “000237 
| 001239 -001153  -001073 000999 -000929 | 000864 
| 003494 003275 003070 002876 “002694 | 002523 
6 | 008212 -007752 | 007316 006902 006510 006139 
016541 | 015726 | -014946 014199 013486 -012804 
8 | 029153 | 027913 | 026715 025559 -024443 023367 
| 9 | 045673 | 044040 042447 040894 039380 ‘037907 
| 10 | -064399 | -062537 | “060700 “058887 “057101 | -055343. 
062547 | “080730 078910 | “077089 075270 073456 


TABLE—(continued). 


13-7 


| 
| 


000001 
000015 | 
000105. 


“000481 
“001648 
“004514 
“010308 
*020173 
“034547 
*052588 
*072046 
*089730 


102441 | 
"107957 | 


“105644 
“096488 
“082618 
“066580 
‘050675 
036539 
*025030 
016329 


O10168 | 
006057 | 


“003457 


001895 | 


“000998 
“000507 


“000002 


> 


“000006 
“000045 
000219 
“000803 
002362 
005787 
012152 
*022330 
036472 
053614 
‘071648 


13°8 
“000001 
“000014 
“000097 
“000445 
“001535 
“004236 
009743 
*019207 
“033132 
“050802 
*O70107 
087953 


101146 | 


*107370 
"105836 
097369 
083981 
068173 
*052266 
037962 
026193 
017213 
‘010797 
006478 
003725 
*002056 
“001091 
*000558 
“000275 
000131 
“000060 
“000027 


| 000012 


*000005 
*000002 


| 000001 


“000006 
“000041 


000202 | 


“000747 
002211 
“005454 
011530 
021331 
035078 
“051915 
“069850 


13°9 


“000001 | 


“000013 


“000089 | * 
| 


001429 


003974 | 


009206 
“018280 | 
*031762 
“049054 
“068185 
“086162 
“099804 
“106713 
“105951 | 
“098181 
"085295 
‘069741 
"053856 
| 039400 | 
027383 
“018125 
| 011452 | 
“006921 | 
004008 | 
002229 | 
“001191 
000613 | 
“000305 
*000146 
“000068 | 
“000030 | 
“000013 | 
“000006 
“000002 
000001 


“000005 
“000038 
000186 
000694 | 
‘002069 | 
005138 | 
‘010937 
“020370 
033723 
“050247 
“068062 


140 


017392 


030435 | 


047344 
066282 
084359 


098418 | 


*105989 
“105989 
“098923 


086558 | 


071283 


055442 | 


040852 


028597 | 


019064 | 


012132 
007385 
004308 


002412 | 
001299 
‘000674 | 


000337 
000163 


“000076 | 


“000034 
“000015 
“000006 
“000003 
“000001 


“000005 
“000034 
“000172 
“000645 
“001936 
004839 
“010370 
“019444 
*032407 
-048611 
‘066287 


F 
| 
i | 
Le 0000121 
(000380 
001331 
MO03727 
| 7 
10 
13 
| 
1 
22 
2 4 
000248 
000117 29 
“000053 
“000024 
000 32 
4 ‘000004 35 3 
us 1 
4 
> 
8 
10 




TABLE—(continued). 


| 


x | 14.1 | 14.2 | 14.3 | 14.4 | 14.5 | 14.6 | 14.7 | 14.8 | 14.9 | 15.0 | x


12 096993 095530 | "094034 | ‘092507 °090951 | ‘089371 °087769 086148 | 084510 | 082859 12 
12. -105200  *104349 | -103437 °102469 | -101446 | -100371 -099247 | 098076 | 096862 095607 13 
14 *105951 | *105839 | 105654 “105396 | °105069 | -104672 -104209 | *103681 | *103089 | 102436 14 
15 099594 | *100195 | :100723 | -101181 | *101567 | 101881 *102125 | *102298 | -102402 | *102436 15 
16 087768 ‘088923 | 090021 091063 °092045 | 092967 093827 | ‘094626  -095361 096034 16 
1” 072795 074277 | 075724 °077135 | 078509 | 079842 081133 | 082380 | 083581 084736 17 
18 057023 °058596 | 060158 061708 :063243 -064761 066259 067735 | -069187 | 070613 18 
| 19 042317 | °043793 | 045277 046768 | 048264 -049763 °051263 | 052762 | °054257 | ‘055747 19 | 
20 029834 -031093 | 032373  -033673  -034992 036327 | 037678 039044 040422 | 041810 20 
020031 °021025 | 022045 023090 | 024161 025256 | 026375  °027517 028680 | °029865 21 | 


= 


22 | 012838 °013570 | ‘014329 | 015114 | 015924 °016761  -017623 | 018511 | 019424 | ‘020362 22 
007870 -008378 008909 | -009462 | 010039 010640 | ‘011264 011911 012584 | 013280 23 | 
4 *004624 004957 | 005308 | ‘005677 | 006065 006472 006899 *007345 *007812 | 008300 24 
25 002608 °002816 | 003036 | 003270 -003780 | 004057 004348  -004656 | 004980 25 
24 001414 *001538 | 001670 -001811 001962 -002123 002294 | °002475 002668 | ‘002873 | 26 | 
| 
| 


000739 -000809 | 000884 | 000966 | 001054 *001148 001249 | 001357 001473 | °001596 | 27 | 
28 000372 000410 | 000452 | 000497 ‘000546 00598 -000656 000717 | 000855 28 
29 +000181 +000201 | 000223 | -000247 000273 -000301 -000332 000366 000403 | (000442 29 
30 000085 *000095 000106 -000118 -000132 000147 | 000163 -000181 “000200 | -000221 30 
-000039 *000044 | 000049 | 000055 000062 -000069 -000077 -000086 000096 | 000107 3i | 


32 000017 *000019 -000022 | -000025 000028 | -000032 —*000035 000040 | 000045 | 000050 32 | 
33 000007 *000008 -000011 | 000012 000014 -000016 -000018 “000020 | 000023 33 
34 000003 000003 | 000004 -000005  -000005 000006 -000007 000008 | 000009 | 000010 34 
35 | 000001 000001 -000002 -000002 | -000002  -000002 -000003 | “000003 | -000004 | 000004 35 
‘ 36 ‘000001 | 000001 -000001 | *000001 —-000001 000001 2 | 000002 36 
q 37 — --- “000001 | -000001 | °000001 |. 37 
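
The entries of the foregoing table appear to be the successive terms $e^{-m} m^x / x!$ of the Poisson limit, given to six decimal places. The following is a minimal sketch of how any column might be recomputed as a check on a doubtful printed figure; the function names `poisson_term` and `table_column` are illustrative only and form no part of the original paper.

```python
import math


def poisson_term(m, x):
    """Return e^(-m) * m^x / x!, the Poisson term for x occurrences when the mean is m."""
    return math.exp(-m) * m ** x / math.factorial(x)


def table_column(m, places=6):
    """Return (x, term) pairs for one value of m, rounded to `places` decimals.

    The column stops at the first x beyond the mean whose rounded term is zero,
    which mirrors the way the printed columns end in dashes.
    """
    rows = []
    x = 0
    while True:
        term = round(poisson_term(m, x), places)
        if x > m and term == 0:
            break
        rows.append((x, term))
        x += 1
    return rows


if __name__ == "__main__":
    # Print the column for m = 6.0 for comparison with the table.
    for x, term in table_column(6.0):
        print(f"{x:2d}  {term:.6f}")
```

When $m$ is a whole number the terms for $x = m - 1$ and $x = m$ are equal, which gives a quick test of a column: in the printed table the entries for $m = 6.0$ at $x = 5$ and $x = 6$ are both .160623.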


ON THE POISSON LAW OF SMALL NUMBERS. 


By LUCY WHITAKER, BSc. 


PART I. THEORY AND APPLICATION TO CELL-FREQUENCIES.


(1) Introductory. 


Let $p$ denote the probability of the happening of a certain event A, and $q = 1 - p$ the probability of its failure in one trial. Then it is well known that the distribution of the frequencies of occurrence $n, n-1, n-2, \ldots$ times in $N$ series of $n$ trials is given by the terms of the point binomial

$$N(p+q)^n.$$


The fitting of point-binomials plotted on an elementary base $c$ to observed frequency distributions has been discussed by Pearson*, and he has indicated that, if $c$ be unknown, the problem can be solved in terms of the three moment coefficients $\mu_2$, $\mu_3$, $\mu_4$ required to find $c$, $p$ and $n$. In actual practice but few cases of frequency can be found which are describable in terms of a point-binomial, and of these few a considerable section have $n$ negative, $p$ greater than unity and $q$ negative; thus defying at present interpretation, however well they may serve as an analytical expression of the frequency.


The hypothesis made in deducing the binomial $(p+q)^n$ as a description of frequency is clearly that each trial shall be absolutely independent of those which precede it. In this respect it may be said that binomial frequencies belong to the teetotum class of chances, and not to those of card-drawings, when each drawing is unreplaced. In the latter case the “contributory cause groups are not independent,” and our series corresponds to the hypergeometrical rather than to the binomial type of progression†.


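Using the customary notation $\beta_1 = \mu_3^2/\mu_2^3$, $\beta_2 = \mu_4/\mu_2^2$, the binomial is determined from:

$$n = \frac{2}{3 - \beta_2 + \beta_1}, \qquad c^2 = \mu_2\,(6 - 2\beta_2 + 3\beta_1), \qquad pq = \tfrac{1}{2}\,\frac{3 - \beta_2 + \beta_1}{6 - 2\beta_2 + 3\beta_1} \qquad \text{(ii).}$$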


* “Skew Variation in Homogeneous Material,” Phil. Trans. Vol. 186, A, p. 347, 1895.
† Phil. Trans. Vol. 186, A, p. 381, 1895.




In order that $n$ should be positive, it is needful that

$$3 - \beta_2 + \beta_1 = \tfrac{1}{2}\,(6 - 2\beta_2 + 2\beta_1)$$

should be positive. If this is satisfied clearly $c$ will be real because $\beta_1$ is always positive. Further then

$$pq = p(1-p) = \frac{1}{4}\,\frac{6 - 2\beta_2 + 2\beta_1}{6 - 2\beta_2 + 3\beta_1}$$

is always less than a quarter and $p$ and $q$ will therefore be real. If the reader will turn to Rhind’s diagram, Biometrika, Vol. vii. p. 131, he will see that the line $3 - \beta_2 + \beta_1 = 0$ cuts off all curves of Types III, IV, V and VI, and includes a portion only of Type I, with a part of its U and J varieties. The binomial description of frequency, therefore, is not, considering our experience of frequency distributions, likely to be of very universal application.
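
The conditions just stated may be tried numerically. The sketch below assumes the relations $n = 2/(3 - \beta_2 + \beta_1)$ and $pq = \tfrac{1}{2}(3 - \beta_2 + \beta_1)/(6 - 2\beta_2 + 3\beta_1)$, which are what the passage implies for a point binomial; the function names are hypothetical and serve for illustration only.

```python
import math


def binomial_from_betas(beta1, beta2):
    """Return (n, pq) of the point binomial having the given beta1 and beta2.

    Assumed relations: n = 2 / (3 - beta2 + beta1),
    pq = (3 - beta2 + beta1) / (2 * (6 - 2*beta2 + 3*beta1)).
    """
    s = 3.0 - beta2 + beta1
    d = 6.0 - 2.0 * beta2 + 3.0 * beta1
    if s == 0 or d == 0:
        raise ValueError("beta1, beta2 lie on a limiting line; no finite binomial")
    return 2.0 / s, 0.5 * s / d


def split_pq(pq):
    """Return the real pair (larger, smaller) with sum 1 and product pq, or None if not real."""
    disc = 1.0 - 4.0 * pq
    if disc < 0:
        return None
    root = math.sqrt(disc)
    return (1.0 + root) / 2.0, (1.0 - root) / 2.0


if __name__ == "__main__":
    # Betas of a true binomial with n = 10 and pq = 0.3 * 0.7.
    n, pq = 10, 0.21
    beta1 = (1 - 4 * pq) / (n * pq)
    beta2 = 3 + (1 - 6 * pq) / (n * pq)
    print(binomial_from_betas(beta1, beta2))
    print(split_pq(pq))
```

A negative value of the first returned quantity marks a distribution lying beyond the line $3 - \beta_2 + \beta_1 = 0$, for which the binomial has no direct interpretation.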

(2) Further Limitations. 

Now let us still further limit our binomial by supposing:

(i) that the unit of grouping of the observed frequencies corresponds to the actual binomial base unit $c$ and (ii) that the first of the observed frequencies corresponds to the term $Np^n$ of the binomial*.

In this case the mean $m$ of the observed frequency measured from the first term of the frequency will be equal to the $nq$ of the binomial and the standard deviation $\sigma$ of the observed distribution will be equal to $\sqrt{npq}$. We have thus:

$$p = \frac{\sigma^2}{m}, \qquad q = 1 - \frac{\sigma^2}{m}, \qquad n = \frac{m^2}{m - \sigma^2},$$

and $n$ and $q$ will both be negative, if $m$ be less than $\sigma^2$. The condition for a positive binomial is therefore that $\sigma$ be less than $\sqrt{m}$.
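
As an illustration of this limited fit, the sketch below solves $m = nq$ and $\sigma^2 = npq$ for $n$, $p$ and $q$. It is a minimal example under those two relations only; the name `fit_limited_binomial` is an assumption made for the illustration, and the figures used are invented and not drawn from any observed series.

```python
def fit_limited_binomial(m, sigma2):
    """Solve m = n*q and sigma2 = n*p*q (with p + q = 1) for (n, p, q).

    m is the mean measured from the first term, sigma2 the squared
    standard deviation. n and q come out negative when sigma2 > m.
    """
    if m <= 0:
        raise ValueError("the mean must be positive")
    p = sigma2 / m
    q = 1.0 - p
    if q == 0:
        raise ValueError("variance equals mean: q = 0 and n is unbounded")
    n = m / q
    return n, p, q


if __name__ == "__main__":
    print(fit_limited_binomial(2.0, 1.5))  # variance below the mean: positive binomial
    print(fit_limited_binomial(2.0, 3.0))  # variance above the mean: n and q negative
```

The case of variance equal to mean is rejected because $q$ then vanishes and $n$ has no finite value.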

(3) Probable errors of the constants of a Binomial Frequency. 

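It is desirable to find the probable errors of $p$ and $n$ as determined by these formulae. We have:

$$\mu_1' = nq, \qquad \mu_2 = npq,$$
$$\delta\mu_1' = q\,\delta n + n\,\delta q, \qquad \delta\mu_2 = pq\,\delta n + nq\,\delta p + np\,\delta q,$$

assuming deviations may be represented by differentials.
Hence, since $\delta p = -\delta q$:

$$\delta\mu_2 - (p - q)\,\delta\mu_1' = q^2\,\delta n \quad \text{and} \quad p\,\delta\mu_1' - \delta\mu_2 = nq\,\delta q.$$

Square each of these results, sum for all samples and divide by the number of samples, and we have:

$$q^4\sigma_n^2 = \sigma_{\mu_2}^2 + (p-q)^2\,\sigma_{\mu_1'}^2 - 2(p-q)\,\sigma_{\mu_2}\sigma_{\mu_1'}\,r_{\mu_2\mu_1'},$$
$$n^2q^2\sigma_q^2 = \sigma_{\mu_2}^2 + p^2\,\sigma_{\mu_1'}^2 - 2p\,\sigma_{\mu_2}\sigma_{\mu_1'}\,r_{\mu_2\mu_1'}.$$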

* The exact nature of these limitations must be fully appreciated. The best fitting binomial to
a given frequency distribution will usually be far from one in which the first term of the binomial
corresponds to the first observed frequency. The modes of the binomial and the observed frequency
will closely correspond, but the “tails” of the binomial may be quite insignificant and correspond to no
observed frequencies.




38 On the Poisson Law of Small Numbers 


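Now $\sigma_{\mu_2}$ is the standard deviation of variations in $\mu_2$ and therefore

$$\sigma_{\mu_2}^2 = (\mu_4 - \mu_2^2)/N.$$

Similarly $\sigma_{\mu_1'}$ is the standard deviation of variations in the mean and therefore

$$\sigma_{\mu_1'}^2 = \mu_2/N.$$

Lastly the product $\sigma_{\mu_1'}\sigma_{\mu_2}r_{\mu_1'\mu_2}$ measures the correlation between
deviations in $\mu_1'$ and $\mu_2$ and is known to be* $\mu_3/N$.

Thus we have:

$$q^4\sigma_n^2 = \frac{1}{N}\{\mu_4 - \mu_2^2 + (p-q)^2\mu_2 - 2(p-q)\mu_3\},$$

$$n^2q^2\sigma_q^2 = \frac{1}{N}\{\mu_4 - \mu_2^2 + p^2\mu_2 - 2p\mu_3\}.$$

But† $\mu_4 = npq\{1 + 3(n-2)pq\}$, $\mu_3 = npq(p-q)$, $\mu_2 = npq$.
Whence after some purely algebraical reductions we deduce:

$$\sigma_n^2 = \frac{2n^2p^2}{Nq^2}\Bigl(1 - \frac{1}{n}\Bigr) \quad (v), \qquad \sigma_q^2 = \sigma_p^2 = \frac{2p^2}{N}\Bigl(1 + \frac{1-3p}{2np}\Bigr) \quad (vi).$$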


Formulae (v) and (vi) are very important; they enable us to obtain the
probable errors for n and p when a binomial limited in the present manner is
fitted to a frequency distribution‡.
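
As a check on these probable errors, the sketch below evaluates them numerically. It assumes the forms obtained by carrying the reductions through from the moment relations, namely σ_q² = p{1 + (2n − 3)p}/(nN) and σ_n² = 2n(n − 1)p²/(Nq²); the function name `binomial_probable_errors` is hypothetical. It is exercised on the first of “Student’s” series quoted later in this paper (N = 400, q = −.1893, n = −3.6054), for which the probable errors are printed as .0647 and 1.2209.

```python
import math

PROBABLE_ERROR_FACTOR = 0.67449  # probable error = 0.67449 * standard deviation


def binomial_probable_errors(q, n, N):
    """Probable errors of q (equally of p) and of n for the limited binomial.

    sigma_q^2 = p * (1 + (2n - 3) p) / (n N)
    sigma_n^2 = 2 n (n - 1) p^2 / (N q^2)
    """
    p = 1.0 - q
    var_q = p * (1.0 + (2.0 * n - 3.0) * p) / (n * N)
    var_n = 2.0 * n * (n - 1.0) * p * p / (N * q * q)
    if var_q < 0 or var_n < 0:
        raise ValueError("formulae give a negative variance for these inputs")
    return (PROBABLE_ERROR_FACTOR * math.sqrt(var_q),
            PROBABLE_ERROR_FACTOR * math.sqrt(var_n))


pe_q, pe_n = binomial_probable_errors(q=-0.1893, n=-3.6054, N=400)
print(f"probable error of q = {pe_q:.4f}, of n = {pe_n:.4f}")
```

The same call with q = .02949 and n = 46.2084 gives values close to the .0457 and 71.7373 printed for the second series, which suggests the assumed forms agree with those used in the paper.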


We see at once, that as n grows large and q grows small
$\sigma_q$ approaches the limit $\sqrt{2/N}$,
or the probable error, $.67449\sqrt{2/N}$, of p and q is finite. But $\sigma_q$ being finite $\sigma_n$
becomes infinitely great, or the probable error of n indefinitely large. Thus when
the n of the binomial is very large, q being very small, the probable error of its
determination is so great that its actual value is not capable of being found
accurately. Again, suppose N embraced 200 observations, the probable error of q
would be of the order .07; if N corresponded to only eighteen observations, then
the probable error of q would be of the order .22. It is clearly wholly impossible

* Biometrika, Vol. II. “On the Probable Errors of Frequency Constants,” see p. 275 (iv), p. 276 (vii),
and p. 279 (xii).

† Phil. Trans. Vol. 186, A, p. 347, 1895.

‡ There is no difficulty in obtaining the probable errors of n and p from the more general values
in (ii). In this case

The values of $\sigma_{\beta_1}$, $\sigma_{\beta_2}$ and $r_{\beta_1\beta_2}$ for different values of $\beta_1$ and $\beta_2$ have been tabled by Rhind, Biometrika,
Vol. VII. pp. 136-141.




Lucy WHITAKER 39 


from series of observations even of the order 200, much less of order 18, to assert
that q is or is not really a “small quantity.” Thus the observed value of q corresponding
to a population of extremely small q might easily show q = .15 to .50.
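
The orders of magnitude quoted for the probable error of q follow from the limiting value .67449 √(2/N). A short sketch, with a hypothetical function name, reproduces them for the sample sizes mentioned in this paper:

```python
import math


def limiting_probable_error_of_q(N):
    """0.67449 * sqrt(2 / N): probable error of q as n grows large and q small."""
    return 0.67449 * math.sqrt(2.0 / N)


# Sample sizes taken from the text: 200 and 18 here, 400, 25 and 14 further on.
for N in (400, 200, 25, 18, 14):
    print(N, round(limiting_probable_error_of_q(N), 4))
```

Halving the probable error requires four times as many observations, which is why even a series of 200 leaves q so loosely determined.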


(4) Poisson—Law of Small Numbers. 


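A last limitation of the point-binomial is made by supposing the mean m = nq
to remain finite, but q to be indefinitely small. We write:

$$N(p+q)^n = Np^n\Bigl(1 + \frac{q}{p}\Bigr)^n = N(1-q)^{m/q}\Bigl(1 + \frac{q}{p}\Bigr)^n$$

$$= Ne^{-m}\Bigl(1 + m + \frac{m^2}{2!} + \frac{m^3}{3!} + \dots\Bigr) \text{ nearly.}$$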


Here the successive terms give the frequency of occurrence of 0, 1, 2, 3...
successes on the basis of each success not being prejudiced by what has previously
occurred. This is the Law of Small Numbers. It was first published by Poisson
in 1837*. It was adopted later by Bortkewitsch, who published a small treatise
expanding by illustrations Poisson’s work†. The same series was deduced later
by “Student” in ignorance of both Poisson and Bortkewitsch’s papers, when
dealing with the counts made with a haemacytometer‡.

The mean is at m from the first group, the other moments as “Student” has
shewn§ are:

$$\mu_2 = m, \qquad \mu_3 = m, \qquad \mu_4 = 3m^2 + m.$$

Hence $\beta_1 = 1/m$, $\beta_2 - 3 = 1/m$.
When the mean value is large, $\beta_1$, $\beta_2$ and the higher β’s approach the values
given by the Gaussian curve.
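
The passage to the limit can be illustrated numerically. The sketch below (function names are hypothetical) holds the mean m = nq fixed at 10, lets n increase, and reports the largest difference between corresponding terms of the binomial and of the Poisson series; it also evaluates β₁ and β₂ from the moments just given.

```python
import math


def poisson_terms(m, kmax):
    """e^{-m} m^k / k! for k = 0..kmax, built by recurrence."""
    terms = [math.exp(-m)]
    for k in range(1, kmax + 1):
        terms.append(terms[-1] * m / k)
    return terms


def binomial_terms(n, q, kmax):
    """C(n, k) q^k (1 - q)^(n - k) for k = 0..kmax (n a positive integer)."""
    return [math.comb(n, k) * q**k * (1.0 - q)**(n - k) for k in range(kmax + 1)]


def poisson_betas(m):
    """beta_1 = mu_3^2 / mu_2^3 and beta_2 = mu_4 / mu_2^2 for the Poisson series."""
    mu2, mu3, mu4 = m, m, 3.0 * m * m + m
    return mu3**2 / mu2**3, mu4 / mu2**2


def largest_gap(n, m, kmax=25):
    """Largest absolute difference between binomial and Poisson terms."""
    pairs = zip(binomial_terms(n, m / n, kmax), poisson_terms(m, kmax))
    return max(abs(b - p) for b, p in pairs)


m = 10.0
for n in (100, 1000, 10000):
    print(n, f"{largest_gap(n, m):.5f}")
print(poisson_betas(m))
```

The gap shrinks as n grows with m fixed, and β₁ and β₂ − 3 both equal 1/m, so that both vanish for large m as the Gaussian requires.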


Clearly the Poisson-Exponential formula contains only the single constant
$m = \mu_1'$, and its probable error is therefore $.67449\,\sigma/\sqrt{N} = .67449\sqrt{m/N}$. This will,
if N be reasonably large and m not too big, be a small or at any rate a finite
quantity (i.e. not like $\sigma_n$ for q very small). Hence it might be supposed, although
erroneously, that the Poisson-Exponential formula was capable of great accuracy
in addition to its great simplicity. But this is to neglect the fundamental
assumptions on which it is based, namely:


(i) that the data actually correspond to a binomial, 
(ii) that in that binomial q is small and n large. 


Clearly (i) shows us that, if we can find the binomial, it will actually be closer 
to the observed frequency than Poisson’s merely approximate formula. 


* Recherches sur la Probabilité des Jugements. Paris, 1837, pp. 205 et seq.

† Das Gesetz der kleinen Zahlen, Leipzig, 1898.

‡ “On the Error of Counting with a Haemacytometer,” Biometrika, Vol. v, pp. 351-5, 1907.

§ They may be deduced at once from (iv).




Secondly (ii) can only be justified as an assumption by actually ascertaining 
the form of the binomial from the data and testing whether n is large and q small 
and positive. It appears absurd to base our formula on an approximation to 
a binomial of a particular kind when, on testing in the actual problem, such a 
binomial does not describe the results. As a merely empirical formula, the 
Poisson-Exponential of course can be tested by the usual processes for measuring 
goodness of fit, but no such test nor any discussion of the probable errors of their 
results have been provided by Bortkewitsch himself nor by Mortara, who has 
followed recently his lines in a work to be considered later. As a matter of fact in 
the cases dealt with by Bortkewitsch, by Mortara and by “Student,” n will be found 
almost as frequently small and negative as large and positive, and q takes a great 
variety of values large and negative and large and positive, as well as small 
and positive. Thus the initial assumptions made from which the “law of small 
numbers” is deduced are by no means justified on the material to which it has so 
far been applied. 


(5) Application of the Law of Small Numbers to determine the Probable Errors
of Small Frequencies. Given a distribution of frequency for a population $\mathcal{N}$, let $\bar{n}_{st}$
be the frequency in the cell of the sth row and tth column of a contingency table
(or if we drop t, $\bar{n}_s$ would stand for the frequency of any class). Then if we take
a random sample of N individuals from this population, the chance that an individual
is taken out of the $\bar{n}_{st}$ cell is $\bar{n}_{st}/\mathcal{N}$, and that it is not is $1 - \bar{n}_{st}/\mathcal{N}$. Therefore if
the original population be so large that the withdrawal of an individual does not
affect the next draw, the frequency of individuals in M random samples of N will
be given by the terms of the binomial:

$$M\Bigl\{\Bigl(1 - \frac{\bar{n}_{st}}{\mathcal{N}}\Bigr) + \frac{\bar{n}_{st}}{\mathcal{N}}\Bigr\}^N.$$

Now, if $\bar{n}_{st}/\mathcal{N}$ be very small, and N large, this will approximate to the
Poisson series:

$$Me^{-m}\Bigl(1 + m + \frac{m^2}{2!} + \frac{m^3}{3!} + \dots\Bigr),$$

where $m = (\bar{n}_{st}/\mathcal{N}) \times N$. But $\bar{n}_{st}/\mathcal{N}$ will approximately be the mean proportion of the
whole in the st cell of the sample itself $= n_{st}/N$, or $m = n_{st}$. Thus if in any cell of
a contingency table, or in any sub-class of a frequency whatsoever, we have a
frequency $n_{st}$ small as compared to the population N, then in sampling, this small
frequency will have a distribution approximating to the Poisson Law, and tending
as $n_{st}$ becomes larger to approach the Gaussian distribution*. It would appear,


* Such approach is usually assumed when we speak of
$$.67449\sqrt{n_{st}\Bigl(1 - \frac{n_{st}}{N}\Bigr)}$$
as the probable error of the frequency $n_{st}$. But such a “probable error” has really no meaning if $n_{st}$
be very small and the exponential law be applied.




therefore, that the Poisson Law of Small Numbers should be applied in order to
deal with the errors of random sampling in any small frequency, and an appeal
should not be made, as is usually the case, to Sheppard’s Tables on the assumption
that the frequency is Gaussian.


The following Table I illustrates the results obtained (a) from the Binomial,
(b) from the Poisson-Exponential and (c) from the normal curve on the two
hypotheses that (i) the frequency is 10 in the 1000 and (ii) is 30 in the 1000.
But here a word must be said as to which Gaussian is to be compared with the
Binomial or the Poisson-Exponential. The usual method of fitting a Gaussian is
to give it the same mean and standard-deviation as the material to which we are
fitting it. For example, we should compare the Poisson exponential with a Gaussian
at mean m and with standard-deviation $\sqrt{m}$, or the point binomial with mean nq


TABLE I.

Comparison of Binomial, Poisson-Exponential and Gaussian for cell-frequency
variations in samples for case of 10 and 30 in a total population of 1000

Percentage Frequency

               10 in 1000                  |               30 in 1000
 Cell   Binomial   Poisson   Gaussian      |  Cell   Binomial   Poisson   Gaussian
   0     .00004    .00005     .00132       |   19     .00848    .00894     .01100
   1     .00044    .00045     .00327       |   20     .01287    .01341     .01553
   2     .00220    .00227     .00735       |   21     .01857    .01916     .02118
   3     .00739    .00757     .01491       |   22     .02556    .02613     .02792
   4     .01861    .01892     .02736       |   23     .03362    .03408     .03544
   5     .03745    .03783     .04539       |   24     .04233    .04260     .04373
   6     .06274    .06306     .06806       |   25     .05110    .05112     .05198
   7     .08999    .09008     .09224       |   26     .05927    .05898     .05970
   8     .11282    .11260     .11300       |   27     .06613    .06553     .06625
   9     .12561    .12511     .12514       |   28     .07107    .07021     .07104
  10     .12574    .12511     .12526       |   29     .07367    .07263     .07360
  11     .11431    .11374     .11334       |   30     .07375    .07263     .07367
  12     .09516    .09478     .09271       |   31     .07137    .07029     .07126
  13     .07305    .07291     .06854       |   32     .06684    .06590     .06659
  14     .05202    .05208     .04580       |   33     .06064    .05991     .06013
  15     .03454    .03472     .02767       |   34     .05334    .05286     .05246
  16     .02148    .02170     .01511       |   35     .04553    .04531     .04423
  17     .01256    .01276     .00746       |   36     .03775    .03776     .03602
  18     .00693    .00709     .00333       |   37     .03042    .03061     .02835
  19     .00362    .00373     .00134       |   38     .02384    .02417     .02156
  20     .00179    .00187     .00049       |   39     .01819    .01859     .01584
  21     .00085    .00089     .00016       |   40     .01351    .01394     .01125
  22     .00038    .00040     .00005       |   41     .00979    .01020     .00771
  23     .00016    .00018     .00001       |   42     .00691    .00729     .00511




and standard-deviation $\sqrt{npq}$. These will, however, not be identical standard
deviations as p is not truly unity. In ordinary practice, in testing for example the
30 in 1000 frequency, we should put the centre of our Gaussian at our 30 group,
and use a standard deviation $= \sqrt{30(1 - 30/1000)} = \sqrt{30 \times .97} = 5.39444$ to enter
the table of the probability integral. This is, of course, the Gaussian we obtain
by the method of least squares, but to assume that it is “the best” is to argue in
a circle, because we then take least squares as a test of what is best*. It is
not the Gaussian which is directly reached by proceeding either to a limit of the
Binomial or to the Exponential, for example, by applying Stirling’s Theorem. It
will be seen by examining Table II that the Gaussian curve develops out of the
exponential by a mode at the point midway between the two equal terms, rather
than by a mode at the mean, which coincides with the centre of the second of
them. If we apply Stirling’s Theorem to the term†

$$u_r = N\frac{n!}{r!\,(n-r)!}\,p^{n-r}q^r$$

of the binomial $N(p+q)^n$ it becomes

$$u_r = \frac{N}{\sqrt{2\pi}\sqrt{npq}}\exp\Bigl[-\frac{\{r - nq + \tfrac12(p-q)\}^2}{2npq}\Bigr],$$

i.e. the ordinate of a Gaussian curve of Standard Deviation $\sqrt{npq}$ and mean at
$nq - \tfrac12(p-q)$. These give for the Poisson-Exponential the Gaussian with standard-deviation
$\sqrt{m}$ and mean $m - \tfrac12$. The above type of curve which gives frequencies
by ordinates and not by areas has been termed by Sheppard a ‘spurious curve
of frequency’; at the same time it is the method by which Laplace and Poisson
first reached the normal curve, and the real point at issue is whether we shall get
better approximations to the discontinuous frequencies of the binomials by using
Gaussian ordinates than by using the areas of a Gaussian curve. At the same
time it has been shewn‡ that if a Gaussian curve gives a series of frequencies by
its areas, then if its standard-deviation be σ, a spurious Gaussian frequency curve
with standard deviation given by $\sigma_1^2 = \sigma^2 + \tfrac{1}{12}h^2$, h being the sub-range, will closely
give the frequencies by its ordinates. It seems probable therefore that the
Gaussian curve with mean at $nq - \tfrac12(p-q)$ and standard deviation $\sqrt{npq - \tfrac{1}{12}}$
will more closely represent the binomial for cell frequency variation by its areas,


* There is a further flaw in this treatment: the Gaussian is continuous, the Binomial and the
Poisson-Exponential are not. If $t_r$ be the rth term of either of the latter series, we ought really to
make
$$u = S\Bigl\{t_r - \frac{N}{\sqrt{2\pi}}\int_{(r-\frac12-m)/\sigma}^{(r+\frac12-m)/\sigma} e^{-\frac12 x^2}\,dx\Bigr\}^2$$
a minimum by the conditions $du/dm = du/d\sigma = 0$. No complete solution of this problem has hitherto
been determined.


† The final form for $u_r$ may be obtained by neglecting the terms in $1/n$ in the formula given by
Pearson, Phil. Trans. Vol. 186, A, p. 347, footnote.

‡ Biometrika, Vol. p. 311.


[Plate VII. Biometrika, Vol. X, Part I. Histograms of frequency; vertical scale: Frequencies (total 1000); Gaussian curves superposed.]



than if we apply the ordinary process of mean nq, standard deviation $\sqrt{npq}$, and
Sheppard’s table for areas to the frequencies. It will be noted that this amounts
to using Sheppard’s correction on the crude second-moment and slightly shifting
the central ordinate towards the side of greater frequency. This is the Gaussian
curve used in Table I.
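
The three columns of Table I can be recomputed from their definitions. The sketch below is an illustration only (the function names are hypothetical); it takes the Gaussian to have mean nq − ½(p − q) and standard deviation √(npq − 1/12), and gives its frequency by the area over the cell from k − ½ to k + ½, as described above.

```python
import math


def normal_cdf(z):
    """Area of the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def binomial_term(n, q, k):
    return math.comb(n, k) * q**k * (1.0 - q)**(n - k)


def poisson_term(m, k):
    return math.exp(-m) * m**k / math.factorial(k)


def shifted_gaussian_area(n, q, k):
    """Area over (k - 1/2, k + 1/2) of the Gaussian with
    mean nq - (p - q)/2 and standard deviation sqrt(npq - 1/12)."""
    p = 1.0 - q
    mean = n * q - 0.5 * (p - q)
    sd = math.sqrt(n * p * q - 1.0 / 12.0)
    return normal_cdf((k + 0.5 - mean) / sd) - normal_cdf((k - 0.5 - mean) / sd)


n, q = 1000, 0.01          # the "10 in 1000" case
for k in (0, 5, 10, 15, 20):
    print(k,
          f"{binomial_term(n, q, k):.5f}",
          f"{poisson_term(n * q, k):.5f}",
          f"{shifted_gaussian_area(n, q, k):.5f}")
```

Setting q = 0.03 gives the "30 in 1000" half of the table in the same way.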


The object of the present section of our work is to indicate how far it is
legitimate to use the Poisson-Exponential up to cell frequencies of the order 30
in a population of about 1000* and how far we then reach a state of affairs, which
for practical purposes may be described by ordinary tables of the Gaussian. It
will be seen from Table I that the Poisson-Exponential even for $n_{st}$ = 10 and 30 is
not extremely divergent from the Binomial.

In Plate VII the transition of the exponential histograms of frequency towards
the Gaussian form is indicated for cell-frequency = 1, 5, 10, 15, 20, 25 and 30; in
the cases of 10 and 30 the corresponding Gaussian curves are drawn.

It will be seen that with due caution the Poisson-Exponential may be reasonably
used up to frequencies of about 30 in the 1000, and that after that it would
be fairly satisfactory to use the areas of the Gaussian curve as provided in the usual
tables.


(6) In order to table the results of the Poisson-Exponential for easy use, it
seemed desirable to turn them into percentages of excess and defect. For example
take the distribution for a frequency 5. It is:

                          Per cent. of Cases in which:

  0    .006,737,945    a defect of 5 occurs               0.674
  1    .033,689,725    a defect of 4 or more occurs       4.043
  2    .084,224,310    a defect of 3 or more occurs      12.465
  3    .140,373,850    a defect of 2 or more occurs      26.503
  4    .175,467,310    a defect of 1 or more occurs      44.049
  5    .175,467,310    the true value occurs             17.547
  6    .146,222,755    an excess of 1 or more occurs     38.404
  7    .104,444,825    an excess of 2 or more occurs     23.782
  8    .065,278,015    an excess of 3 or more occurs     13.337
  9    .036,265,564    an excess of 4 or more occurs      6.809
 10    .018,132,782    an excess of 5 or more occurs      3.183
 11    .008,242,173    an excess of 6 or more occurs      1.370
 12    .003,434,238    an excess of 7 or more occurs      0.545
 13    .001,320,860    an excess of 8 or more occurs      0.202
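
The percentages of excess and defect are cumulative sums of the Poisson terms taken outwards from the true value. A minimal sketch, with hypothetical function names, for any whole-number cell frequency m:

```python
import math


def poisson_pmf(m, k):
    return math.exp(-m) * m**k / math.factorial(k)


def excess_defect_percentages(m, max_excess=8):
    """Percentages of cases showing a given defect or excess when the
    true cell frequency is the integer m.

    defect[d]: observed value <= m - d;  excess[e]: observed value >= m + e.
    """
    cdf = []
    running = 0.0
    for k in range(m + max_excess):
        running += poisson_pmf(m, k)
        cdf.append(running)                  # cdf[k] = P(X <= k)
    defect = {d: 100.0 * cdf[m - d] for d in range(1, m + 1)}
    actual = 100.0 * poisson_pmf(m, m)
    excess = {e: 100.0 * (1.0 - cdf[m + e - 1]) for e in range(1, max_excess + 1)}
    return defect, actual, excess


defect, actual, excess = excess_defect_percentages(5)
for d in sorted(defect, reverse=True):
    print(f"defect of {d} or more: {defect[d]:.3f}")
print(f"true value: {actual:.3f}")
for e in sorted(excess):
    print(f"excess of {e} or more: {excess[e]:.3f}")
```

Called with m from 1 to 30, the same function yields the columns of a table of the kind given in Table II.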


* Of course in the Poisson-Exponential itself the total frequency plays no part; it is only useful in
testing the validity of the approximation.


Thus we see that if the true value of the frequency be 5 for the average sample,
it will only lie outside the range 1 to 10 in .674 + 1.370 = 2.044 cases per cent., or
the odds are 49 to 1 that the value found will be from 1 to 10.

On the other hand it will lie outside the range 2 to 8 in 4.043 + 6.809 = 10.852 %
of cases, or once in about 9 trials the frequency will lie outside this range. Or,
again, once in about every four trials (25.8 %) the result will fall outside the
range 3 to 7.


On the other hand if we write $\sigma = \sqrt{5(1 - 5/1000)} = 2.23047$, we have −4.5
and +5.5 as the deviations from a mean 5 of all below 0.5 and above 10.5,
giving $x/\sigma = -2.0175$ and $+2.4658$ respectively. These cut off tail areas of
.02181 and .00684, respectively. Thus in 2.865, not 2.044, per cent. of cases
we should assert that the frequency would lie outside the range 1 to 10, or the
odds that it would lie inside this range are now only about 34 to 1, not 49 to 1.
Calculated from the Gaussian the frequencies outside ranges 2 to 8 and 3 to 7
correspond to 10.1 % and 26.2 % of the trials instead of 10.9 % and 25.8 %. If
we take for the standard-deviation of our Gaussian $\sqrt{npq - \tfrac{1}{12}} = 2.21171$, we find
that the odds in the first case are still only 35 to 1, but the percentages in the
other two cases are 11.3 and 25.8.
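
The comparison for the range 1 to 10 can be retraced as follows. This is a sketch under the stated assumptions only: a true frequency of 5 in a population of 1000, Gaussian areas taken beyond the half-unit boundaries, and hypothetical function names.

```python
import math


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def poisson_cdf(m, k):
    """P(X <= k) for a Poisson variate of mean m; zero when k < 0."""
    return sum(math.exp(-m) * m**j / math.factorial(j) for j in range(k + 1))


def percent_outside_poisson(m, low, high):
    """Per cent. of cases with observed value below low or above high."""
    return 100.0 * (poisson_cdf(m, low - 1) + 1.0 - poisson_cdf(m, high))


def percent_outside_gaussian(m, sd, low, high):
    """Same percentage from Gaussian tail areas beyond low - 1/2 and high + 1/2."""
    lower_tail = normal_cdf((low - 0.5 - m) / sd)
    upper_tail = 1.0 - normal_cdf((high + 0.5 - m) / sd)
    return 100.0 * (lower_tail + upper_tail)


m, population = 5, 1000
sd = math.sqrt(m * (1.0 - m / population))      # 2.23047
print(f"Poisson : {percent_outside_poisson(m, 1, 10):.3f} per cent.")
print(f"Gaussian: {percent_outside_gaussian(m, sd, 1, 10):.3f} per cent.")
```

The Gaussian tails are unequal because the range 1 to 10 is not symmetrical about 5, which is where most of the disagreement with the Poisson figure arises.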


It will be clear that near the centre of the curve—especially when we equalise 
the excess and defect of the Gaussian by taking equal ranges on both sides—it 
does not give bad percentages of frequency, but that it does not lend itself to 
the accurate determination of the range for reasonable working odds such as 
50 to 1. 


It will be noted that the total area in excess and defect of 2 and more
= 23.782 + 26.503 = 50.285, or corresponds very nearly to the “probable error.”
Actually the Gaussians with standard deviations of 2.23047 and 2.21171 give
probable errors of 1.504 and 1.492 respectively, so that the Gaussian with 1.5 as
the probable error is very nearly accurate.


Table II gives the Poisson-Exponential; it will enable the reader to appreciate
the range of probable variation in small frequencies. Thus we realise that in
37 % of cases in which the true frequency is 1, the cell will be found empty;
in 13.5 per cent. of cases it will be empty when the actual frequency is 2, and in
5 % of cases when the frequency is 3 and in 1.8 % when the frequency is 4. These
results indicate how rash it is to assume that a sample 4-fold table with one zero
quadrant signifies perfect dependence or association in the attributes of the
material sampled. The second line below gives the percentages of cases that 0
would appear in a cell when the actual number to be expected is that in the first
line calculated from Table II on the usual theory of a priori probabilities:

Actual ...      |   0   |   1   |  2   |  3   |  4   |  5   |  6   |  7   |  8   | 9 & over
Percentage ...  | 63.21 | 23.25 | 8.55 | 3.15 | 1.16 | 0.43 | 0.16 | 0.06 | 0.02 | 0.01
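
The percentages in this small table can be reproduced on one reading of “the usual theory of a priori probabilities”, namely that every whole-number actual frequency 0, 1, 2, ... is taken as equally likely beforehand. That reading is an assumption of the sketch below, as is the function name; it does, however, return the printed figures.

```python
import math


def posterior_given_empty_cell(max_actual=9):
    """Per cent. chance that the actual frequency is m, given an empty cell,
    with equal prior weight on m = 0, 1, 2, ...

    The chance of an empty cell when the actual frequency is m is e^{-m};
    normalising the geometric series gives (1 - e^{-1}) e^{-m}. The final
    entry lumps together all m >= max_actual.
    """
    exact = [100.0 * (1.0 - math.exp(-1.0)) * math.exp(-m) for m in range(max_actual)]
    tail = 100.0 * math.exp(-max_actual)
    return exact + [tail]


for m, pct in enumerate(posterior_given_empty_cell()):
    label = f"{m} & over" if m == 9 else str(m)
    print(f"{label:>8}: {pct:.2f}")
```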


TABLE II. 
Table of Poisson-Exponential for Cell Frequencies 1 to 30. 
Cell Frequencies 
| 
2 | 1 2 3 4 5 6 7 8 9 10 | 
| 
8 20 | | 
19 | 
18 | } | 
| | | 
| 16 | | | 
1 | | | | 
2 1h | 
13 | 
| 12 | | 
| 11 | 
| 
3.8 10 | “005 
| 8 | 034 | 123 ‘277 | 
ag] 7 | ‘091 “302 623 | 1-033 
6 | ‘248 ‘730 | 1:375 | 2-193 | 2-995 
pe 5 | 674 | 1°735 | 2°964 | 4-238 | 5-496 | 6-708 | 
5 4 | ————| 1°832 | 4-043 | 6-197 | 8-177 | 9-963 | 11°569 | 13-014 
3 | | | 12°465 | 15°120 | 17-299 | 19-124 | 20°678 | 22-022 
és 2 13°534 | 19°915 | 23°810 | 26-503 | 28-506 | 30-071 | 31-337 | 32-390 | 33-282 
1 | 36-788 | 40°601 | 42°319 | 43-347 | 44-049 | 44-568 | 44-971 | 45-296 | 45-565 | 45-793 | 
| 
Actual] 36°788 | 27-067 | 22°404 | 19°537 | 17°547 | 16-062 | 14-900 | 13°959 | 13°176 | 12°511 
| 
1 | 26:424 | 32°332 | 35°277 | 37-116 | 38-404 | 39-370 | 40-129 | 40-745 | 41-259 | 41-696 
8 2 | 8030 | 14-288 | 18-474 | 21-487 | 23-782 | 25-602 | 27-091 | 28:338 | 29-401 | 30-323 
3 | | 5-265 8392 | 11-067 | 13°337 | 15-276 | 16-950 | 18-411 | 19-699 | 20-845 
4 ‘366 | | 3°351 | | 6-809 | 8-392 | 9°852 | 11-192 | 12-492 | 13-554 
5 ‘059 ‘453 | | 3-183 | 4-262 | 5°335 | | 7°385 | 8-346 
6 ‘008 ‘110 “380 ‘813° | 2-009 | 2-700 | 3-418 | 4:146 | 4:875 
7 ‘001 “024 ‘110 284 | 1:281 | 1-726 | 2-203 | 2-705 
= 8 000 -005 “029 092-202 “363 ‘572 ‘823 | 1110 | 
9 ‘001 ‘007 ‘027 | 140 ‘241 ‘372 | ‘719 
| 10 | “000 “002 008 | 023 051 096 ‘159 | "242 ‘346 
11 — | — | 002 ‘007 ‘018 036 | 065 | “105 “160 
| | 12 001 002 006 013 “025 “044 ‘O71 
| Be | 13 “000 001 002 005 | ‘017 “030 
— | 000 | 001 | 002) 003 | 007 | 013 
ink “000 ‘001 001 002 “006 
| 17 = — | — “000 001 
| 28 —- | — | = -- 
5 | 27 | _ _ _ _ _ 


TABLE II (continued).
Cell Frequencies 
| v | 1 12 13 14 15 16 17 18 19 20 
29 
| | | 
20 | 
= 
| 16 -000 000 001 002 
sal ‘000 000 ‘001 “002 “004 7 
Ba) | oc | | | 004! 008; 015 | -026 
| | “000 ‘001 004 009 ‘018 "032 “052 078 
| ‘001 003 009 021 ‘040 ‘067 "104 ‘151 
|. 008 008 022 047 086 “138 ‘289 ‘387 “500 
| 10 052 "105 ‘181 ‘279 “401 543 “706 ‘886 | 1-081 
9 ‘121 ‘229 “374 ‘553 763 | 1:000 |  1°538 | 1°832 | 27139 
8 | 492 ‘760 | 1:073 | 1:423 | | 2°199 | 2-612 | 3°037 | 3-467 | 3°901 
7 | | 2°034 | 2°589 | 3°162 | 3°745 | 4:330 | 4-912 | 5-489 | 6-056 | 6°613 
6 | 3°752.| 4:582 | 5-403 | 6:206 | 6°985 | 7-740 | 8-467 | 9°167 | 9-840 | 10-486 
5 | 7861 | 8-950 | | 10°940 | 11-846 | 12-699 | 13-502 | 14:260 | 14°975 | 15°651 
4 | 14319 | 15503 | 16°581 | 17-568 | 18-475 | 19°312 | 20-087 | 20°808 | 21-479 | 22-107 
- 3 | 23-198 | 24-239 | 25-168 | 26-004 | 26-761 | 27-451 | 28-084 | 28-665 | 29-203 | 29-703 
Fo 2 | 34°051 | 34-723 | 35°317 | 35°846 | 36°322 | 36-753 | 37°146 | 37°505 | 37°836 | 38-142 
1 | 45°989 | 46°150 | 46°311 | 46-445 | 46-565 | 46-674 | 46°774 | 46°865 | 46-948 | 47-026 
| | 
Actual, 11°938 | 11-437 | 10-994 | 10°599 | 10-244 | 9-922 | 9°629 | 9°360 | 9-112 | 8-884 
| | | | | 
» | I | 42°073 | 42-404 | 42-695 | 42-956 | 43°191 | 43-404 | 43-597 | 43-776 | 43-939 | 44-091 
8 2 | 31:130 31-846 | 32-486 | 33-064 | 33°588 | 34-066 | 34°503 | 34°909 | 35-283 | 35-630 
3 | 21°871 22-798 | 23°639 | 24-408 | 25-114 25-765 | 26-367 | 26-928 | 27-451 | 27-939 
4 | 14596 15°559 16-450 | 17-280 | 18-053 18°776 | 19°451 | 20°088 | 20-686 | 21°251 
5 | 9261 | 10°129 | 10-953 | 11°736 | 12°478 | 13-184 | 13-852 | 14-491 | 15-099 | 15-677 | 
E 6 | 5593 | 6297 | 6-983 | 7650 | 8-297 8-923 | 9°526 | 10°111 | 10°675 | 11-219 | 
| 7 | 3219 | 3-742 | 4-266 | 4:791 | 5°311 | 5-825 | 6-329 | 6°826 | 7°313 | 7-789 | 
x | 8 | 1°769 | 2°198 | 2°501 | 2-884 | 3-275 | 3-669 | 4-064 | 4:46] | 4°856 | 5°248 | 
| 9 ‘929 | 1°160 | 1-407 | 1°671 1°947 | 2°232 | 2°523 | 2°824 | 3-127 | 3-433 
10 ‘467 ‘607 ‘762 | | 1°117 | | 1°516 | 1-732 | 1-954 | 27182 
"225 “305 396 | ‘619 | ‘882 | 1:030 | 1°185 | 1°348 
12 104 148 ‘201 | ‘331 | “497 “595 ‘699 | 
ee | 13 ‘O47 ‘069 ‘097 131 ‘172 | 219 272 ‘333 “400 473 
Sa) “020 ‘031 046 063 086 | ‘114 144 "182 ‘223 269 
| 16 ‘008 ‘014 ‘021 “030 042 | 057 ‘074 096 121 149 
16 ‘003 “006 “009 ‘013 ‘020 | 028 ‘036 “050 064 | ‘081 
‘001 ‘002 “004 ‘006 ‘009 ‘014 ‘O17 “025 ‘033 | 042 
‘001 ‘000 “002 ‘002 ‘004 ‘006 008 | 012 ‘O17 “022 
| 19 | 000 ‘000 ‘001 ‘001 002 “003 ‘003 | 006 ‘008 ‘O11 
| 20 ‘001 “000 ‘001 “002 ‘002 | 008 “004 ‘005 
| 21 sade 000 | 000 | 000 ‘O01 001 002 | + 002 | 003 
| — | — | “000 “000 ‘001 ‘001 ‘001 
| | 
TABLE II (continued).
Cell Frequencies 
£ 21 22 23 2h 25 26 n 28 29 30 
| 4 | 
| & 20 = ‘000 000 ‘001 001 001 | +002 
19 000 “000 ‘001 ‘001 002 | 008 ‘004 006 
2p 18 000 001 ‘001 002 004 006 | 009 012 017 
| 17 001 002 003 “005 008 ‘O11 016 | 031 041 
16 003 006 010 015 022 ‘031 043 | 056 073 092 
15 012 020 “030 043 059 078 “129 160 “195 
1b 039 058 081 “109 142 “180 “224 "273 | 328 ‘387 
| ‘111 “150 252 314 | “384 “460 543 | 
|g | 12 277 “B55 “443 540 647 ‘762 | 884 1-014 | 1-151 | 1-293 
"625 763 ‘912 | | | 1:417 | | 1-791 | 1-987 | 2-187 
e% | 10 1-290 | 1°512 | 1-743 | 1:983 | 2-229 | 2-482 | 2-739 | 3-000 | 3-263 | 3-598 
9 2°455 | 2°778 | | 3:440 | 3:775 | 4:111 | 4-446 | 4-781 | 5-114 | 5°444 
£5 8 4°336 | 4°769 | 5-200 | 5626 | 6-048 | 6-463 | 6°872 | 7-274 | 7°669 | 8-057 
7 7157 | 7°689 | 8-208 | 8-713 | 9-204 | | 10°147 | 10°599 | 11-038 | 11-465 
6 | 11°107 | 11-704 | 12°277 | 12°827 | 13°358 | 13-867 | 14°357 | 14°830 | 15-285 | 15-724 
» 5 | 16292 | 16-900 | 17-477 | 18-025 | 18-549 | 19°048 | 19-525 | 19-981 | 20-417 | 20-836 
4 | 22°696 | 23-250 | 23-771 | 24-263 | 24°730 | 25:172 | 25-591 | 25-990 | 26°371 | 26-734 
= 3 | 30°168 | 30-603 | 31-010 | 31°391 | 31-753 | 32-094 | 32-416 | 32-721 | 33-011 | 33-287 
2 | 38-426 | 38-691 | 38-938 | 39°168 | 39-387 | 39°593 | 39-786 | 39-970 | 40 143 | 40-308 
1 | 47-097 | 47°164 | 47-226 | 47-283 | 47°340 | 47-392 | 47-440 | 47-486 | 47-530 | 47-572 
Actual] 8°671 | 8:473 | | 8-115 | 7:952 | 7-799 | 7-654 | 7°517 | 7°387 7:264 
4 1 | 44:232 | 44:363 | 44-485 | 44-603 | 44°708 | 44°810 | 44-906 | 44-997 | 45-083 45°165 
3 2 | 35°955 | 36-258 | 36-542 | 36-812 | 37-:062 | 37-299 | 37°525 | 37-739 | 37-942 | 38-135 
| | 28-397 | 28-828 | 29-235 | 29-620 | 29-982 | 30-326 30°653 | 30°965 | 31:262 31°546 
= | 4 | 21-785 | 22-290 | 22-770 | 23-227 | 23-660 | 24-074 | 24-469 | 24-847 | 25-208 | 25°555 
| 16-230 | 16°758 | 17-264 | 17-748 | 18-211 | 18-655 | 19-083 | 19-493 | 19°888 20-269 
= | 6 | 11-744 | 12-251 | 12-740 | 13-213 | 13-669 | 14-110 | 14-538 | 14-951 | 15-351 15-738 
7 8°254 | 8°709 | | 9585 | 10-007 | 10-418 | 10-819 | 11-210 | 11°591 11-962 
8 5-637 | 6022 | 6-402 | 6777 | 7-146 | 7:509 | 7°866 | 8-218 | 8°562 | 8-901 
9 3°742 | 4°052 | 4:362 | 4:670 | 4:978 | 5:284 | 5°588 | 5:890 | 6°188 | 6:484 
20 2°415 | | 2895 | 3°138 | 3°385 | 3-632 | 3°880 | 4:129 | 4:377 4°625 
a 11 1517 | 1°692 | 1°873 | 2°057 | 2-246 | 2-438 | 2-633 | 2°831 | 3-030 | 3-230 
| 12 927 | 1:051 | 1°181 | | 1:456  1°599 | 1-747 | 1-899 | 2-053 | 2°210 
5S | 13 “552 637 “727 821 ‘921 1025 | 1°134 | | 1°362 | 1-481 
14 “320 “376 437 5 57 643 720 | 885 | 
181 217 256 345 394 “448 | “504 564 626 
22) 16 “100 147 ‘173 204 “237 272 | “311 352 | 
| 17 054 082 098 118 "139 162 “188 215 | 
18 -028 036 045 “055 067 “080 095 | 129 | 149 
19 ‘O15 019 024 -030 037 045 054 | 065 076 | 
$ 20 010 013 016 020 "025 031 | 044 | 052 
21 004 “005 007 oll ‘014 017 021 025 | 030 
5 22 002 002 004 006 009 | ‘O11 014 | 017 
8 23 001 001 002 003 003 004 005 007 | 010 
2 001 001 001 -002 002 002 003 | 003 004 006 
= 25 000 000 001 ‘001 001 001 001 002 002 | 003 
26 = 000 000 001 001 | 001 001 | :002 | 




PART II. CRITICISMS OF PREVIOUS APPLICATIONS OF 
POISSON’S LAW OF SMALL NUMBERS. 


(7) We now turn to the illustrations which various authors have given of
the Law of Small Numbers.


“Student’s” Cases. We take first the series given by “Student” in his memoir
on counting with a Haemacytometer*. They are of special importance because
the series at first appear of fairly adequate size, namely consisting of 400
individuals, and further we should anticipate that the Law of Small Numbers
would hold in his cases. He obtains better fits with the binomial than with the
exponential but, as he remarks, he has one more constant at his disposal. On the
other hand, if the exponential be a true approximation, the binomial ought to come
out with a large n and a small but positive q. “Student” finds for his four
series:

I.   $400 \times (1.1893 - .1893)^{-3.6054}$,
II.  $400 \times (.97051 + .02949)^{46.2084}$,
III. $400 \times (1.0889 - .0889)^{-20.2473}$,
IV.  $400 \times (.9525 + .0475)^{98.5263}$.

II. and IV. may, perhaps, be held fairly to satisfy the conditions, although it
is not certain if 46 is to be considered a large n or .05 a very small q.

I. and III. fail to satisfy the conditions at all, unless the probable errors of q
and n are such that q might really be a small positive quantity and n really large
and positive. The following are the values for the four series of n and q and their
probable errors:

I.   q = − .1893 ± .0647,  n = − 3.6054 ± 1.2209.
II.  q = + .0295 ± .0457,  n = 46.2084 ± 71.7373.
III. q = − .0889 ± .0534,  n = − 20.2473 ± 12.1165.
IV.  q = + .0475 ± .0452,  n = 98.5263 ± 93.7494.

Now while these results are very satisfactory for II. and IV., they are not
wholly conclusive for I. and III. We can approach the matter from another
standpoint; the probable error of q for p = 1 is
$$.67449\sqrt{2}/\sqrt{N} = .67449 \times .0707$$
in “Student’s” cases. Thus the deviation of q from q a very small quantity is for
I. 2.68 times the S.D., and for III. 1.26 times the S.D. Since q may be either
positive or negative, we may reasonably apply the probability tables and the odds
against deviations occurring as great as these are in one trial about 250 to 1 and
9 to 1 respectively. Hence in four trials we should still have large odds against
their combined appearance.

* Biometrika, Vol. v. p. 356.
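
The ratios of q to its standard deviation for series I. and III. can be checked in a few lines. In the sketch below the function names are hypothetical, and the conversion to odds uses a single tail of the normal curve; that choice is an assumption made here because it gives odds of the same order as those quoted, and it is not stated in the text.

```python
import math


def sd_units_from_zero(q, N):
    """|q| divided by sqrt(2 / N), the standard deviation of q when q is truly very small."""
    return abs(q) / math.sqrt(2.0 / N)


def one_sided_odds_against(z):
    """Odds against a normal deviate exceeding z on one side."""
    tail = 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))
    return (1.0 - tail) / tail


for label, q in (("I.", -0.1893), ("III.", -0.0889)):
    z = sd_units_from_zero(q, 400)
    print(label, f"{z:.2f} times the S.D.,", f"odds about {one_sided_odds_against(z):.0f} to 1")
```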




We have said that the results for II. and IV. are fairly satisfactory, i.e. we
mean that they are consistent with q being small and positive and n being large;
but of course they are also consistent with q being negative and n being small and
negative.

It will be obvious from these results for “Student’s” data that it is extremely
difficult to test the legitimacy of the hypothesis on which the “Law of Small
Numbers” is based. In none of the cases dealt with by Bortkewitsch, much less
in those dealt with by Mortara, are the populations (N) anything like as extensive
as those considered by “Student.” But populations of even 400 give, as we see, too
large values of the probable errors of q and n for us to be certain of our conclusions.


(8) Bortkewitsch’s Cases. Taking Bortkewitsch next, he deals with the
following cases:

I. Suicides of Children in Prussia for 25 years: (a) Boys, (b) Girls, 25 cases.

II. Suicides of Women in eight German States for 14 years: 112 cases or
8 subseries of 14.

III. Accidental Deaths in 11 Trade Societies in 9 years: 99 cases, or 11 subseries
of 9.

IV. Deaths from the Kick of a Horse in 14 Prussian Army Corps for 20 years:
280, or, as Bortkewitsch, 200 cases.

It will be noted at once that Bortkewitsch’s populations (N) are far too small
for any effective determination of the legitimacy of his application of Poisson’s
formula to his data.

We take his cases in order: 

I. (a) Suicides of Boys. 

TABLE III. 


Number of Suicides... 2 | 3 | 3 | and over 


Number of Years 


The binomial is:
$$25\,[1.2033 - .2033]^{-9.6425}.$$
Mean = 1.9600 and $\mu_2$ = 2.3584.
We have q = − .2033 ± .2421, n = − 9.6425 ± 10.9416.

If q were really zero its probable error would be ± .1908. Clearly 25 cases are
wholly inadequate to test the legitimacy of applying the Poisson-Exponential to the
frequency*. But to what extent is the reader made conscious by Bortkewitsch
that his cases fail entirely to demonstrate the legitimacy of applying his hypotheses?


* The χ² for the binomial is 2.379 and for the exponential 2.836, showing a somewhat better
result for the binomial.




I. (b) Suicides of Girls.
TABLE IV. 


Number of Suicides 


Number of Years 


The binomial is:
$$25\,[.7418 + .2582]^{1.7041}.$$
Mean = .4400 and $\mu_2$ = .3264.
We find q = .2582 ± .1012, n = 1.7041 ± .7850.
As in the case of the boys’ suicides, if q were practically zero its probable error
would be ± .1908, and there is nothing in this result again to justify us in asserting
that q is indefinitely small and n indefinitely large.


Actually we have:


TABLE V.
Number of Suicides per Year.

                     0       1     2 and over

Actual ...          15       9       1
Bortkewitsch ...    16.1     7.1     1.8
Binomial (a) ...    15.0     8.9     1.1
Binomial (b) ...    15.2     8.7     1.1

(a) is the binomial considered above, (b) is the binomial obtained by taking
n a whole number = 2, and q = mean/2 = .22, i.e. $25\,(.78 + .22)^2$.

It is clear that either (a) or (b) gives better results than the Poisson-Exponential.
Applying the test of goodness of fit, we have

χ² = .007 for the binomial (a),
   = .610 for Bortkewitsch’s solution.

Both give P > .60 but the first is much better than the second.

If both boys and girls are taken together, we find the binomial

25 (.9333 + .0667).

This is the nearest approach to a small q and big n we have so far found, i.e. the
nearest approach so far to an exponential, but it is reached by a process, i.e. that of
adding together two series of entirely different means and variabilities in a manner
which cannot be justified, for Bortkewitsch’s hypothesis depends essentially on the
homogeneity of his material. Even here the fit of the point binomial is slightly
better than that of the exponential.






II. Suicides of Women in Eight German States. Bortkewitsch gives the 
following table : 


TABLE VI.

Number of Suicides of Women per Year (0 to 10), by State

(a) Schaumburg-Lippe
(b) Waldeck
(c) Lübeck
(d) Reuss ä. L.
(f) Schwarzburg-Rudolstadt
(g) Mecklenburg-Strelitz
(h) Schwarzburg-Sondershausen

Totals

The resulting binomials are:

(a) 14 (0.9714 + …
(b) 14 (0.8571 + …
(c) 14 (0.5819 + 0.4181)^…
(d) 14 (1.0058 − …
(e) 14 (1.3929 − …
(f) 14 (0.6071 + …
(g) 14 (1.5792 − …
(h) 14 (1.6609 − 0.6609)^…


Thus it will be seen that of the eight binomials only four have a positive q, and of these only one can be said to have a very small q, and even in this case the n is not indefinitely large. Of the four negative binomials three have quite substantial q's, and the fourth with its small negative q corresponds most closely to the Poisson-Exponential. The probable error of q for q = 0 is ± 0.2549. The number, 14, of cases taken is therefore wholly inadequate to test whether the Poisson-Exponential may be applied to these data. The mean value of q is negative and = −0.0820 ± 0.0901, and the standard deviation of q = 0.3928 ± 0.0637, which are within the limits of random sampling of q = 0 with a standard deviation of 0.3779. We shall return to a different manner of considering the point later. At present we wish only to indicate that the hypothesis is that q is a very small positive quantity and that data which give q a standard deviation of 0.3928, or in the next example of 0.4714, are really inadequate to test such a hypothesis; for in the resulting binomials q may easily lie anywhere between +0.8 and −0.8, and it is not possible to demonstrate that its real value is practically an exceedingly small positive quantity.
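The probable errors quoted for q when q is really zero (± 0.1908 for 25 observations, ± 0.2549 for 14) are numerically consistent with a standard deviation of √(2/N) multiplied by the usual factor 0.67449. The sketch below assumes that form; it is inferred from the quoted figures and is not stated in this part of the paper.

```python
import math

PROBABLE_ERROR_FACTOR = 0.67449  # probable error = 0.67449 * standard deviation


def sd_of_q_at_zero(n_obs):
    """Assumed standard deviation of the fitted q when the true q is zero."""
    return math.sqrt(2.0 / n_obs)


def probable_error_of_q_at_zero(n_obs):
    """Assumed probable error of the fitted q when the true q is zero."""
    return PROBABLE_ERROR_FACTOR * sd_of_q_at_zero(n_obs)


for n_obs in (25, 14, 9):
    print(n_obs, round(sd_of_q_at_zero(n_obs), 4),
          round(probable_error_of_q_at_zero(n_obs), 4))
```

On this assumption a series of only 9 to 25 observations leaves q uncertain by two or three tenths, which is the point the text is making.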



III. Accidental Deaths in 11 Trade Societies. Bortkewitsch provides data from which the following table is deduced:


TABLE VII.

(Number of deaths per year, columns 0 to 14, for each of the eleven societies over 9 years; grand total 99. The individual entries are not legible.)

The resulting binomials are:

(13) 9 (0.4914 + …
(14) 9 (0.6184 + 0.3816)^…
(12) 9 (1.9227 − 0.9227)^…
(20) 9 (1.1282 − …
(23) 9 (0.9921 + …
(27) 9 (0.5229 + 0.4771)^…
(29) 9 (1.4130 − …
(41) 9 (0.8454 + 0.1546)^…
(40) 9 (2.0342 − 1.0342)^…
(42) 9 (0.9322 + 0.0678)^…
(55) 9 (0.6154 + 0.3846)^…

Of these eleven binomials seven have a positive q; only one of these, (23), actually corresponds to a really small q and large n, although a second, (42), approximates to this condition. In the five other cases the q's are quite substantial; in (13) the q is larger than p. Of the four negative q's none can be said to be so small and the n so large as to suggest that they really correspond to the Poisson-Exponential. The probable error of q for q = 0 is, however, ± 0.3180, and thus for such small series no test whatever can be really reached of the legitimacy of applying the Poisson-Exponential to such data. We may note, indeed, that seven of the eleven values of q exceed the probable error and two of these are more than three times the probable error. We should only expect two negative values of q as great or greater than 0.9227 in 80 trials, whereas two have occurred in 9 trials,




so that the odds are considerably against such an experience. The mean value of q is −0.0469 ± 0.0959 and the standard deviation of q is 0.5127 ± 0.0678, both results compatible with q indefinitely small and a standard deviation = 0.4714. The main problem, however, of the legitimacy of applying the Poisson-Exponential to such series cannot be answered by data involving only total frequencies of 9 to 14 cases in the individual series.


Bortkewitsch examines the matter from another standpoint. He clubs the 
results given for each application of the Poisson-Exponential together and 
examines the observed totals against the sums of the calculated totals. Thus 


calculating the 11 Poisson-Exponential series* and adding them together 
Bortkewitsch finds for observed and calculated deaths: 


TABLE VIII.
Accidental Deaths in 11 Trade-Societies.

Number of Deaths ...          0     1     2     3     4     5     6     7    …   13 & over
Sums of 11 Exponentials ...   3.7   9.6   …    15.2  14.3  12.3   9.8   …    …   0.6
Single Binomial ...           3.8   9.5  13.9  15.6  14.8  12.4   9.6   6.9  …   0.7


If we attempt to fit a single binomial to the observed line of totals, we obtain:
$m = 4.3636$, $\sigma^2 = 7.5849$,
leading to the negative binomial:
$99\,(1.7382 - 0.7382)^{-5.9111}$.
Here: $q = -0.7382 \pm 0.1829$†, $n = -5.9111 \pm 1.391$,
or the constants are significantly substantial with regard to their probable errors.
The resulting frequencies are given in the last line of the table above. The reader 


* The values of the means and standard deviations for the eleven societies are:

Society    m       σ
(13)       7.889   1.969
(14)       …       …
(12)       2.556   2.217
(20)       4.333   2.211
(23)       6.222   2.485
(27)       …       1.343
(29)       5.889   …
(41)       5.111   2.079
(40)       2.889   2.424
(42)       4.556   2.061
(55)       4.333   1.633

All these means are less than 10, which is the limit reached by Bortkewitsch's Tables for the Poisson-Exponential. Bortkewitsch says he has taken the societies for which “the statistics indicated the smallest numbers of such accidents.” This is not very clear. It is certain that a society with a mean number of accidents = 100, if it consisted of 200,000 members, would be more suitable for application of the exponential than one with a mean of 8 if it only contained 10,000 members. Both Bortkewitsch and Mortara confine their results to means less than 10, and seem to indicate that “smallness” has been determined by the absolute frequencies, but clearly it is relative frequency with which we have to deal. The use of such a term as Das Gesetz der kleinen Zahlen (the Law of Small Numbers) for the Poisson-Exponential seems open to serious objection, if it be associated with “m” an absolutely small number, and not with smallness of “q.”

† For q = 0, the probable error would be ± 0.0959 and accordingly q is very divergent from the Poisson-Exponential value of zero.




will be surprised to see how closely the single negative binomial determined by two constants gives the same result as the sum of the eleven Poisson-Exponentials determined by eleven constants, no one of which is really of any significance for its own exponential*. If we apply the condition for “goodness of fit,” χ² = 5.83 for the single binomial and χ² = 5.88 for the sum of the eleven Poisson exponentials, leading to P = 0.950 and P = 0.951 respectively, or the fit with a single negative binomial is slightly better than that with eleven exponentials. The two constants are significant, the eleven constants have no real significance for their individual series, as is demonstrated by the fact that the binomials for these series do not approximate to the Poisson-Exponential type.
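The single negative binomial for the trade societies can be rebuilt from the two quoted constants, m = 4.3636 and σ² = 7.5849, with total frequency 99. The sketch below is an illustration under those inputs only; the function names are not from the paper, and the last class is simply taken as the remainder after the first thirteen terms.

```python
def fit_binomial_by_moments(mean, variance):
    """p = variance/mean, q = 1 - p, n = mean/q (n and q are negative when variance > mean)."""
    p = variance / mean
    q = 1.0 - p
    return p, q, mean / q


def binomial_terms(total, p, q, n, max_count):
    """Terms of total * (p + q)**n for r = 0..max_count, valid for negative n and q."""
    terms = []
    coeff = 1.0  # generalized binomial coefficient C(n, r), starting at r = 0
    for r in range(max_count + 1):
        terms.append(total * coeff * p ** (n - r) * q**r)
        coeff *= (n - r) / (r + 1)
    return terms


total, mean, variance = 99, 4.3636, 7.5849
p, q, n = fit_binomial_by_moments(mean, variance)
terms = binomial_terms(total, p, q, n, 12)   # 0 to 12 deaths
tail = total - sum(terms)                    # everything beyond 12 deaths
print(f"p = {p:.4f}, q = {q:.4f}, n = {n:.4f}")
print([round(t, 1) for t in terms], round(tail, 1))
```

The first few terms land near the legible entries of the "Single Binomial" line (about 3.8, 9.5, 13.9, 15.6), which is a useful check on the sign conventions for a negative n.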


We may now consider the previous case of suicides of women from the same standpoint†. The following are the data as given by Bortkewitsch:

TABLE IX.
Suicides of Women in Eight German States.

Number of Suicides ...      0     1     2     3     4     5     6     7     8     9    10 & over   Totals
Observed Frequencies ...    …     …     …     …     …     …     …     …     …     …     …          112
Sum of 8 Exponentials ...   …     …     …     …     …     …     …     5.6   3.6   2.1   2.0        112
Single Binomial ...        12.6  18.4  18.8  16.4  13.2   9.9   7.2   5.1   3.5   2.4   …          112


For the single binomial we have:
$m = 3.4732$, $\sigma^2 = 8.2312$,
leading to: $112\,(2.3699 - 1.3699)^{-2.5354}$,
where $q = -1.3699 \pm 0.1490$, $n = -2.5354 \pm 0.3076$.

If q were very small its probable error would be ± 0.0901. The values of q and n are quite significant, q is large and negative and n is small and negative. The resulting frequencies are given in the last line of the table as “Single Binomial.” Turning now to the test of “goodness of fit,” we have for the sum of the 8 exponentials χ² = 7.957, and for the single binomial χ² = 7.740, leading to P = …


* If the reader will turn to the first footnote on p. 53 he will note that for nine cases the standard deviations of the means (σ/√9) are roughly about 0.7, or errors of ± 1 to ± 1.5 may easily occur in the means. Hence with the possible exception of (13) and (27) the m's have not significant differences, and are not typical of the individual societies.

† The values of the means and standard deviations are:

                               m       σ
Waldeck ...                    2.214   …
Lübeck ...                     2.571   …
Reuss ä. L. ...                2.648   1.631
Schwarzburg-Rudolstadt ...     …       …
Mecklenburg-Strelitz ...       5.286   …
Schwarzburg-Sonderhausen ...   5.642   …

(Further entries of this table, whose places cannot be fixed with certainty: 1.995, 1.767, 2.889, 3.061.)

The standard deviation of the mean is here σ/√14, or, say, 0.5. Thus errors of 1 might easily occur in the values of m. There are probably significant differences between the first five and the last three states, but not between the first five among themselves or the last three among themselves. Thus the Poisson-Exponentials, if correct in theory, are not significant for the individual states.


and 0.654 respectively. Thus again the single binomial with only two constants gives a fit slightly better than the sum of eight exponentials with eight constants.

Bortkewitsch, looking at the observed frequencies and the sum of 8 or 11 exponentials, without using any satisfactory test for “goodness of fit,” assumes
that the coincidence is so good as to justify his hypothesis. But a better fit can 
be obtained with two instead of 8 or 11 constants by simply using a negative 
binomial. We must note here that Bortkewitsch is using the final coincidence 
merely as justification of the Poisson-Exponential; the total frequency is not 
describable in terms of the 8 or 11 constants as it is in terms of the two, for 
these eight constants are not really significant for his individual eleven trade 
societies or for the suicides in the individual eight states. If he wants to describe 
the total, he has no constants by which he can do it. If, on the other hand, he 
wishes to describe what has occurred in the individual societies or states, we have 
seen that their binomials differ very widely from Poisson-Exponentials. If, lastly, 
no stress be laid on the individual cases as having too large probable errors, but 
only on the general coincidence with total frequencies, then the same coincidence 
would justify us in using a single binomial with two constants only*. It appears to us that to properly test the Poisson-Exponential, we need not 9 or 14 instances in the individual case, but several hundred instances (more, indeed, than “Student” has taken) and that no proof of the “Law of Small Numbers” can be obtained on data such as those of Bortkewitsch or Mortara.


IV. Deaths from the Kick of a Horse in Prussian Army Corps, omitting four 
Corps with Bortkewitsch. 


Here the results are:

TABLE X.

Number of Deaths ...    0     1     2     3     4    Totals
Number of Corps ...   109    65    22     3     1     200

Whence:
$m = 0.61$,
and the binomial is:
$200\,(0.996557 + 0.003443)^{177.1711}$.
This is the first of Bortkewitsch's illustrations for which his hypothesis that q is small and n large is really justified by his data. For:
$q = 0.0034 \pm 0.0670$,
$n = 177.1711 \pm 3449.103$.
The probable error of q for q really zero is ± 0.0674.
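The constants of this binomial can be checked from the frequency table itself. The sketch below assumes the counts 109, 65, 22, 3 and 1 for 0 to 4 deaths; the last two are taken from the classical form of this series (they are not legible in every copy of the table) and they reproduce the mean of 0.61 used in the text.

```python
def mean_and_variance(counts):
    """counts[r] is the number of observations in which r deaths occurred."""
    total = sum(counts)
    mean = sum(r * f for r, f in enumerate(counts)) / total
    variance = sum(f * (r - mean) ** 2 for r, f in enumerate(counts)) / total
    return total, mean, variance


corps_years = [109, 65, 22, 3, 1]  # assumed counts for 0, 1, 2, 3, 4 deaths
total, mean, variance = mean_and_variance(corps_years)

p = variance / mean   # p = npq / nq
q = 1.0 - p
n = mean / q
print(total, round(mean, 4), round(variance, 4))
print(f"p = {p:.6f}, q = {q:.6f}, n = {n:.2f}")
```

Because q is so close to zero, n is very sensitive to rounding: dividing by q rounded to 0.003443 gives the 177.17 of the text, while the unrounded value gives about 177.19.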


* Of course immensely better general total fits are obtained by using the sums of the actual 8 or 11 
binomials than by the Poisson-Exponential sum or the single binomial, but the results in that case 
involve 16 or 22 non-significant constants. 




The actual results as given by the binomial and the Poisson-Exponential are:

TABLE XI.

Number of Deaths ...    0       1      2      3     4 and over
Observed ...          109      65     22      3     1
Binomial ...          108.6    66.4   20.2    4.1   0.7
Exponential ...       108.7    66.3   20.2    4.1   0.7

Actually if we work to two decimal places in the frequencies we have χ² = 0.61 for both binomial and exponential, or the goodness of fit is practically identical.
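The goodness-of-fit figure is the usual sum of (observed − expected)² / expected over the classes. As a hedged illustration, the sketch below applies it to the exponential with mean 0.61, again assuming the observed counts 109, 65, 22, 3, 1 and pooling everything from 4 deaths upward into the last class.

```python
import math


def chi_squared(observed, expected):
    """Sum of (observed - expected)**2 / expected over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [109, 65, 22, 3, 1]          # assumed counts; last class is "4 and over"
total, mean = sum(observed), 0.61

# Exponential (Poisson) frequencies for 0..3 deaths, remainder pooled as "4 and over".
expected = [total * math.exp(-mean) * mean**r / math.factorial(r) for r in range(4)]
expected.append(total - sum(expected))

chi2 = chi_squared(observed, expected)
print([round(e, 2) for e in expected], round(chi2, 3))
```

Computed this way χ² comes out near 0.60, in line with the 0.61 quoted; the small difference depends on how the frequencies are rounded and how the tail is grouped.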


In this case it seemed worth discussing the binomial fit more at length. Taking the moment coefficients about the mean we have:

(i) Mean $= nq = 0.6100$.
(ii) $\mu_2 = npq = 0.6079$.
(iii) $\mu_3 = npq\,(p - q) = 0.590562$.
(iv) $\mu_4 = npq\,(1 + 3npq - 6pq) = 1.643373$.

We have already discussed the binomial from (i) and (ii), giving χ² for goodness of fit = 0.6096. Using (ii) and (iii) we have for the binomial
200 (0.985739 + 0.014261)^…,
giving χ² = 0.665.
Using (iii) and (iv) we have:
200 (0.979524 + 0.020057)^…,
giving χ² = 0.707.
Putting: $\beta_2 = \mu_4/\mu_2^2$ and $\beta_1 = \mu_3^2/\mu_2^3$
we have: $\beta_2 - 3 = (1 - 6pq)/npq$, $\beta_1 = (1 - 4pq)/npq$,
and working from $\beta_1$ and $\beta_2$ we find:
200 (0.969150 + …
and in this case χ² = 1.1286.

This of course does not give a bad fit, but it is clear that working from the lowest moment coefficients, as we might anticipate, gives the best results.
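The higher moment coefficients, and the fit from (ii) and (iii), can be reproduced from the frequency table. This is a sketch under the assumption that the observed counts are 109, 65, 22, 3, 1; it uses $\mu_3/\mu_2 = p - q = 1 - 2q$ to obtain q and then $n = \mu_2/pq$.

```python
def central_moments(counts):
    """Return (mean, mu2, mu3, mu4) for counts[r] observations of the value r."""
    total = sum(counts)
    mean = sum(r * f for r, f in enumerate(counts)) / total

    def mu(k):
        return sum(f * (r - mean) ** k for r, f in enumerate(counts)) / total

    return mean, mu(2), mu(3), mu(4)


mean, mu2, mu3, mu4 = central_moments([109, 65, 22, 3, 1])  # assumed counts

# Fit from the second and third moments: mu3 / mu2 = p - q = 1 - 2q.
q = (1.0 - mu3 / mu2) / 2.0
p = 1.0 - q
n = mu2 / (p * q)

beta1 = mu3**2 / mu2**3
beta2 = mu4 / mu2**2
print(round(mu2, 4), round(mu3, 6), round(mu4, 6))
print(f"p = {p:.6f}, q = {q:.6f}, n = {n:.2f}")
print(round(beta1, 4), round(beta2, 4))
```

The value of q found this way agrees with the 0.014261 of the second binomial above, and the third and fourth moments agree with (iii) and (iv).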


But if q be the chance of death from the kick of a horse, and n the number of men in an army corps, then the binomial should be

$200\,(p + q)^{n}$.

Now it is obvious that none of the binomials give, by their value of n, any approach to the real number of men in an army corps. If we start with the



number of men n in an army corps as 50,000*, we have nq = 0.61 and q = 0.0000122, thus reaching the binomial
$200\,(0.9999878 + 0.0000122)^{50000}$,
giving as compared against Bortkewitsch:

Deaths          Binomial    Bortkewitsch
0               108.6876    108.6703
1                66.3002     66.2889
2                20.2213     20.2181
3                 4.1115      4.1110
4 and over        0.7034      …
and χ² =          0.608298    0.608318

or, the slight advantage to the binomial exists but is of no significance.


Now it seems to us that in this case the use of the exponential is justified for the total frequencies, but as far as describing those frequencies is concerned, it gives no better result than the binomial. But as in the other five of Bortkewitsch's cases the Exponential is not justified by the individual series themselves†.

It is perfectly true that the exponential has a definite theory behind it, and is interpretable in terms of that theory, i.e. we must suppose the probability of an occurrence very small and the chance of its repetition absolutely identical. But is the second of these conditions ever likely to be demonstrable a priori, or must


* This supposes that every man in the army corps is equally liable to death from the kick of 
a horse; of course a very arbitrary assumption. 
† To illustrate the idleness of the application of the Poisson-Exponential even to these data for the Prussian Army Corps, we give here the binomials for the whole of the 14 corps.

Index Number of Corps    Binomial
…                        20 (0.95 + 0.05)^16.0000
…                        20 (1.325 − 0.325)^−2.4615
…                        20 (1.5667 − 0.5667)^−1.0588
…                        20 (0.9 + 0.1)^6.0000
…                        20 (0.6 + 0.4)^1.0000
…                        20 (0.6318 + 0.3682)^1.4938
…                        20 (1.0912 − 0.0912)^−9.3202
…                        20 (0.9 + …
…                        20 (0.65 + 0.35)^1.0000
…                        20 (0.8115 + …
…                        20 (1.05 − 0.05)^−15.0000
…                        20 (1.11 − …
…                        20 (1.05 − 0.05)^−24.0000
…                        20 (1.1 − 0.1)^−4.0000

One seeks in vain through these binomials for any approach to q very small and positive and n very large and positive. In no case does n approach the number of men in an army corps, say 50,000, or q equal the chance of a death from the kick of a horse, say 0.0000122! It seems impossible by clubbing such equations together to give any satisfactory proof that the Poisson-Exponential really does apply to individual cases. In the 20 years involved, there were doubtless great changes in both the training and the personnel of each army corps, and the results obtained may be just as much due to such causes as to the errors of small samples.




not we a posteriori demonstrate it from the data themselves? Child suicide may 
be influenced by example, by environmental conditions in different districts, 
possibly even by meteorological conditions in different years. Again, even in 
different army corps the conditions may be far from uniform, the spirit of the 
corps, the teaching with regard to the handling of horses, the experience of past 
life according to whether the corps is raised in town or rural districts may all tell. 
Even Bortkewitsch before he gets his best fit removes four corps or 80 observations from his data. We do not criticise this removal, but even unremoved he says the fit of theory with experience leaves “wie man sieht, nichts zu wünschen übrig” (“as one sees, nothing to be desired,” p. 25). But the binomial is before removal:
280 (1.085714 − …

in which q is not very small and is negative, and n is not very large and is not positive. It is true that the probable error of q for q insignificant is in this case ± 0.0570, but this only shows that the data were insufficient in quantity to determine whether the exponential could be applied or not.


(9) Mortara’s Cases. 


Mortara* in an interesting paper has realised the possibility of repetitions not being independent and has discussed a constant Q′, by which he proposes to test such influence. This quantity Q′ should be unity, if the Bortkewitschian hypothesis can be applied. He then takes 16 or 17 districts with records of 10 years, and calculates the mean number of deaths from some special cause per year, say, for each district for those years. If this mean number exceeds 10, he casts out that district, presumably on the ground either (i) that such a number is no longer small, or (ii) that it differentiates the district from those with lower numbers. Thus Bologna with 10.9 deaths by murder is excluded and Bergamo with 8.4 is included, although Q′ = 1 for both. Bologna with 7.1 deaths from small-pox is included, but Pavia with 12.3 is excluded although the Q′ of the former is 2.5 and that of the latter 1.7. What method should be employed in dealing with the frequency of the excluded districts, which may amount to 50% of all districts, is not discussed. Having thus reduced his available districts, Mortara proceeds to apply the exponential to each individual district; he adds up the results for each district and compares his totals with the observed totals. It will thus be observed that he fits his exponential to ten observations, and then adds together five or more districts to get his totals. We can equally well apply this process by fitting a binomial to each 10 observations and then adding up such results. But it is quite clear that on the basis of ten observations it is, owing to the large probable errors, wholly impossible to assert whether a binomial of the kind required by the Bortkewitsch-Mortara hypothesis, i.e. one of very small positive q and very large positive n, really is justified. We can illustrate this at once from Mortara's Tables (see his pp. 42 and 45) for deaths from Chronic Alcoholism. The


* “Sulle variazioni di frequenza di alcuni fenomeni demografici rari,” Annali di Statistica, Serie V, Vol. IV, pp. 5-81. Roma, 1912.




observed numbers, and those deduced from the binomials are given in the 
accompanying table. At the foot are the observed totals, Mortara’s exponential 
totals and the binomial totals. 


TABLE XII. Deaths from Chronic Alcoholism. 


Calabria 
Binomial 
Foggia | Oo. 

5 5 2 M. 
B. 


Siracusa 


Potenza 


Catanzaro 


Salerno 


Cosenza 


A 1°47 | 1°49 | 1°32 | 1°04 
87 -64| 


9°78 | 13°07 | 13-09 11-33 | 9°03 6-81 4°87 2-08 124-70) 
12°75 | 14°82 12-64 9-28 | 6-57 | 4-70 ‘ay 7 
| 


The following are the binomials for the 8 districts out of 16 which Mortara has selected.

Reggio Calabria   10 (0.7842 + …
Foggia            10 (0.9609 + …
Siracusa          10 (1.3000 − …
Potenza           10 (1.5500 − …
Catanzaro         10 (2.7524 − 1.7524)^…
Salerno           10 (2.3510 − …
Cosenza           10 (2.5808 − 1.5308)^…
Bologna           10 (3.3161 − …


| Q ~ > ~ » | | |44 
0 1 4 6 | 10 11 12 13 | over | 
| 
82 | 2°05) 2°56) 214) 1:34) 67 28) 10° 03) — | — — M. 
1°12) 2°16/ 2°33| 1°85) 1:21] 35) -07| o1/ — |—|—/ — 
41) 1°30) 209 2°23 1-78) 1-14) 61) -11) — | —| —| — 
‘78| 1°61) 1°95| 1°80 1-41] -98 -63| -38| 12 | “06 | 03 | -01/ 01} — |B. 
| } | 
15} 63} 1°32| 1°85) 195/163) 1-14} -17| 07] — 1M. 
1°35] 1:46 1:36 ‘95 75 | ‘37 “31 | 23) -12 08 | “17 B. 
| | | | 
06| 1:35| 172/175 1-49 109| -69| -39| ‘09 -04| -O1/ M. 
40} 1°18) 1°31} 1-27/1°14| -77| 45) 17| 12) B, 
| 1°17) 1°27] 1:23/1:10/ -93| -75| -59| -24/-17/-13| B. | 
| | | | 
1 | — | 
74) -28/-15| M, 
| | -21) -71/B 
| | | 
| 
°16) M. 
| B. | 




Examining these we see that there are only two in which q and n are positive and only one in which q is small and positive and n moderately large. The probable error of q for 10 observations on the assumption that n is very large and q very small is ± 0.3016 and is quite inconsistent with the last four districts being samples from exponentially distributed frequencies. The other four districts may or may not belong to such frequencies; the data are wholly inadequate to determine whether they do or not. Reggio Calabria and Foggia have the lowest Q′s, i.e. 0.9 and 1.0. But that six districts out of an already selected eight give negative q and a seventh a relatively large q and small n suggests the inapplicability of the hypothesis adopted. If we seek for “goodness of fit” of the totals, we find:

          Binomial    Exponential
χ² =      25.12       47.92
P =       0.0336      0.0000

Thus the odds against the binomial system are 28 to 1, but the odds against the exponential are enormous. It does not seem possible to justify the treatment of such data by the use of the Poisson-Exponential.


Let us turn to a second of Mortara's illustrations, that of deaths from small-pox. He rejects first six out of the 17 districts; the remaining eleven are given in Table XIII. The districts give the following binomials:

Venezia     10 (0.9500 + …
Bologna     10 (0.9889 + 0.0111)^…
Treviso     10 (2.2000 − 1.2000)^…
Pavia       10 (1.8000 − 0.8000)^…
Cagliari    10 (4.5190 − …
Padova      10 (3.6833 − 2.6833)^…
Verona      10 (5.6000 − 4.6000)^…
Brescia     10 (9.9727 − 8.9727)^…
Bergamo     10 (2.3821 − 1.3821)^…
Catanzaro   10 (15.6128 − 14.6128)^…
Vicenza     10 (3.4854 − 2.4854)^…

Out of the eleven cases only two give q small and positive; not a single one gives for q anything like the chance of a death from small-pox in the district, nor for n anything like the population of the district. There is an increasing divergence from the positive binomial as Mortara's Q′ increases in value. We see that in nine cases, however, a negative binomial, not the exponential, is required to describe the frequencies. The probable error of q, for insignificant q, is as before ± 0.3016, and therefore it is improbable that q is zero in at least 9 out of these 11 districts.

Examining the totals we find

          Binomial    Exponential
χ² =      9.64        570.79
P =       0.67        0.000000




TABLE XIII.
Deaths from Small-pox (1900-1909).


| Venezia 
Bologna 
Treviso 

Pavia 

| Cagliari 


Padova 


Brescia 
Bergamo 
| 


Vicenza 


Totals 


| Catanzaro 


| 


3 
“17 
1°28 


39 


19°24 | 


40°25 


} 28 | 11 
| 24°97 | 21°50 
23-70 | 14-05 


1°95 | 1°60 1 
1:02 | 


82 | 


09 | 
65 | 


06-03 
75 3°31 


Observed 
Mortara 
Binomial 


0... 
M. 


| B. 


| M. 




* 1 at ‘12 or more’ in the cases of Brescia and Catanzaro was found to signify 1 at 20 in the case of Brescia, and 1 at 27 in the case of Catanzaro, if the means were to agree with those given by Mortara.
| 
| | | | | | 
4-40| 3°71) 1:46) —| —| —| — 
| | ; | | | | | 
407) 3°66) 165, 49) 02) — | — — i | 
| 3-68) 1°65) -49/ 01) -| —|—|—|—| 
| | | | | 
| 368 | 3°68) 184; 61) ol] — | — | — 
518 | 2°36| 1518 -61| -09| -05| —|—| — |B | 
| 4 | 3 | 2); — 1 | | | 
| 217) 87 | 26 | 06) — | —|— | —| — | 
| | | | | | . 
| | 1-23] 2°57) 2-70] 1°89] 04] 01) |—|—| — | 
| 407 | 189 117| “79 | -15| -11| -08| -25 | 
| | -91| 261] 2-09] 1:25] -60| -o3| o1) —|—| — 
| 203| 1-40] 98) -70| -19| -13|-10| | 
| Verona | 4 3 — 1 1 | 1 
| | 218) 2°61) 60) 24) 08| 03) o1)—)— | — 
| 1-22| 201) 221] 1-°82/1-20) -66] -31| -13| -05|-02;—| — | 
| 429| 142] 62] -24 4 17 79 | 
| +20] 1°54] 200] 1-95) 1:52] -99| -55| -27| -12/-04!-02| -O1 | 
| 157| 1°46] 1:23| -98| 74 | 38 | ta! ‘12| | 
| | 
| -20| -79| 154] 2°00) 195/152) -99| 27) -12/-04| 02] -O1 
| #00) 120/ “71/50 38 | 25 | 16-14-12) 104 
| 1:50 | 1:23 -48 | 
| 
= 12 2 6 | at 
| 16°54 | 11°76 | 7°58 | 4°38 | 2°25 1-07| -46 | “16 M. 
| 8:58| 5°78 1-20 | “99 | B. | 
4 | 




In other words the binomials give a reasonable total fit, the exponentials a 
practically impossible one. 


But there is another question to be asked in such series as those of Mortara: What justification is there in cutting off at 10 cases, say of murder? A province may have a million inhabitants and, perhaps, 40 murders occur in a year*. Hence the binomial is for ten year returns
$10 \times \left(\dfrac{24{,}999}{25{,}000} + \dfrac{1}{25{,}000}\right)^{1{,}000{,}000}$,
but this is as close as anything can be desired to the exponential series. It may be reasonable to apply a separate series to districts giving 4.2 and 36.6 murders per annum respectively, but it is difficult to see why the latter district should be altogether excluded from treatment. If the theory of the binomial be applicable at all, then it applies practically as well to districts with 40 murders as to districts with 4; for, we need no indefinitely small q to get a closely exponential series.
If we take the case of deaths by murder, Mortara has retained only 6 out of 16 provinces, yet his criterion Q′ (see his Table, p. 51) is not more divergent from unity for the rejected provinces than for those retained; the binomials are indeed

Treviso   10 (0.7000 + 0.3000)^…
Venezia   10 (0.5619 + 0.4381)^…
Vicenza   10 (0.9571 + …
Padova    10 (0.4… + 0.5226)^…
Pavia     10 (1.8162 − 0.8162)^…
Bergamo   10 (0.8857 + 0.1143)^…

only one of which gives q small and positive and n large.
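The claim that a province of a million inhabitants with 40 murders a year already gives a practically exponential series can be checked numerically. The sketch below compares the binomial with n = 1,000,000 and q = 1/25,000 against the exponential with mean 40, term by term; it works with logarithms through `math.lgamma` so that the large factorials do not overflow. The range of 0 to 120 murders is an arbitrary choice made for the illustration.

```python
import math


def log_binomial_term(r, n, q):
    """Logarithm of C(n, r) * q**r * (1 - q)**(n - r)."""
    return (math.lgamma(n + 1) - math.lgamma(r + 1) - math.lgamma(n - r + 1)
            + r * math.log(q) + (n - r) * math.log1p(-q))


def log_exponential_term(r, m):
    """Logarithm of e**(-m) * m**r / r!."""
    return -m + r * math.log(m) - math.lgamma(r + 1)


population = 1_000_000
q = 1 / 25_000
m = population * q  # mean number of murders per year

largest_gap = max(
    abs(math.exp(log_binomial_term(r, population, q))
        - math.exp(log_exponential_term(r, m)))
    for r in range(0, 121)
)
print(m, largest_gap)
```

The largest difference between corresponding terms is far below anything a ten-year record could detect, which supports the point that a mean of 40 is no obstacle in itself.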


The mean Q′ for the retained provinces is 0.967 with a range from 0.7 to 1.4 and for the rejected 1.03 with a range from 0.8 to 1.4. Even if (which is not the case) the probability of an individual being murdered were too great for the exponential, it ought to follow the binomial, but this, as a rule, it does not do, unless we give some wholly new interpretations to q and n; the actual values render the theory of the binomial as stated inapplicable.


(10) Mortara’s Criterion. 


As a matter of fact the only test of whether an exponential will legitimately fit a given series or not is to determine the binomial $(p + q)^n$ and ascertain whether p is slightly less than unity. But:

$p = \dfrac{npq}{nq} = \dfrac{(\text{Standard Deviation})^2}{\text{Mean}}$.


* We assume that each individual is equally likely to be murdered. But if there be a graduated 
probability for murder throughout the community, what right have we to apply Poisson’s series at all? 
The essential basis of the application—equal chance of each individual—is wanting. 




Now if $m_s$ be the number of deaths, say, occurring in any year and there be $l$ years under consideration, then:

$(\text{Standard Deviation})^2 = \dfrac{S(m_s^2)}{l} - n^2q^2$,

or, if we use the form preferred by Bortkewitsch*

$(\text{Standard Deviation})^2 = \dfrac{S(m_s - nq)^2}{l - 1}$.

Hence:

$p = \dfrac{S(m_s - nq)^2}{(l - 1)\,nq}$.

This in other notation is Mortara's Q′², the only criterion he actually uses, provided by his equation (17 ter), p. 18. Thus his Q′, which he says must not differ much from 1, is only √p, and it would be better to use p, which has a direct physical meaning, than Mortara's Q′ = √p. Clearly Mortara's somewhat elaborate process of deducing Q′ does not amount to more than saying: Fit a point binomial and test if p is slightly less than unity. We contend that it is best straight off to fit the binomial.
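Read in this way, the criterion reduces to a ratio of the variance of the yearly counts to their mean. The sketch below computes p and Q′ = √p from a series of yearly counts, with either divisor l − 1 or l; the ten counts used are invented for the illustration and are not taken from Mortara's tables.

```python
import math


def mortara_criterion(yearly_counts, use_l_minus_1=True):
    """Return (p, Q) with p = variance / mean of the yearly counts and Q = sqrt(p).

    use_l_minus_1 selects the divisor l - 1 for the variance; otherwise the divisor is l.
    """
    l = len(yearly_counts)
    mean = sum(yearly_counts) / l
    sum_of_squares = sum((m_s - mean) ** 2 for m_s in yearly_counts)
    variance = sum_of_squares / (l - 1 if use_l_minus_1 else l)
    p = variance / mean
    return p, math.sqrt(p)


example_counts = [2, 0, 1, 3, 1, 0, 2, 1, 4, 1]  # hypothetical ten-year record
p_a, q_prime_a = mortara_criterion(example_counts)                       # divisor l - 1
p_b, q_prime_b = mortara_criterion(example_counts, use_l_minus_1=False)  # divisor l
print(round(p_a, 4), round(q_prime_a, 4))
print(round(p_b, 4), round(q_prime_b, 4))
```

A value of p above unity corresponds to a negative binomial, and a value slightly below unity to the case in which the exponential may be used.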




It is true that Mortara does not reach his Q′², our p, by the simple process of asking whether the binomial is one with a positive probability less than unity. He endeavours to obtain it by considering whether there is “lumpiness” in the observations. But it seems to us clearer and briefer to ask: Are the contributory cause-groups independent as in teetotum spinning? If so, the data will fit a true binomial and p will of necessity be a positive quantity less than unity. If they are not of this character then p must of necessity be greater than unity. It is of interest to see how Mortara's test of dependence of contributory cause groups leads to a criterion, but he actually only gets his Q′², i.e. our binomial p, after a series of hypotheses which much limit, and that in no very obvious manner,

* The use of √l or √(l − 1) in the value of the standard deviation when l is small has been several times discussed. It may be dealt with as follows: The probable errors of a mean as deduced by the two processes are

$E = 0.67449\,\sigma/\sqrt{l}$
and $E' = 0.67449\,\sigma/\sqrt{l - 1}$;

now $E' = 0.67449\,\dfrac{\sigma}{\sqrt{l}}\left(1 + \dfrac{1}{2l}\right)$ approximately $= 0.67449\,\dfrac{1}{\sqrt{l}}\left(\sigma + \dfrac{\sigma}{2l}\right)$.

Now the probable error of σ is $0.67449\,\sigma/\sqrt{2l}$, and $\sigma/2l$ is less and often much less than $0.67449\,\sigma/\sqrt{2l}$. Hence if we only know σ from the observations themselves, and this is the usual case, we have:

$E' = 0.67449\,\sigma'/\sqrt{l}$,

where σ′ differs from σ by a quantity usually far less than the probable error of σ. In other words the refinement of using E′ for E is idle having regard to the accuracy of our observations; and the form used by Bortkewitsch and Mortara with √(l − 1) for √l is of no importance.


the nature of those contributory causes groups. Of course if their dependence 
were of the nature of successive draws from a pack, then the result would be 
a hypergeometrical series and Q would have no physical meaning for the series 
at all. 


(11) We will deal with one further illustration out of many considered by 
Mortara which are of like character. In the case of Marriages of Uncle and Niece 
(see Table XIV, p. 65), where the distribution of Q’s is the most favourable 
for his theory, the binomials are 


Marche       10 (0.7000 + 0.3000)^…
Umbria       10 (0.9000 + …
Basilicata   10 (1.4000 − …
Sardegna     10 (0.44545 + …
Emilia       10 (0.9818 + 0.0182)^…
Abruzzi      10 (0.8429 + …
Lazio        10 (1.2548 − …
Puglie       10 (1.5111 − …
Veneto       10 (1.3444 − …
Toscana      10 (2.2667 − 1.2667)^…
Calabria     10 (1.3584 − …

of which only one (Emilia) approaches the conditions for an exponential distribution. If we test the totals at the foot of Table XIV, we find the result much to the advantage of the binomial, for which P = 0.902 as against 0.714 for the exponential.


(12) On Mortara's own showing nearly all the Q′s of his numerous series are greater than unity, and very few of the binomials are positive. If we consider the distribution of Q′s given in his work, omitting Table 13 (Deaths from Malaria), we find a range from 0.5 to 3.6 with a mean Q′ at

1.2565 ± 0.0847,

while for the distribution of all the p's in the binomials we have determined, we find a range from 0.4 to 15.6 with a mean p at 2.5655 ± 0.3817.

These results are sufficient to show that there is no real distribution of p round the value unity but the binomials have a distinct tendency to be negative.


(13) But the whole theory of Poisson's exponential law in the hands of Bortkewitsch and Mortara appears essentially vague. The binomial is built up on the assumption of the repetition n times of a number of independent events, of which the chance of occurrence is identical and equal to q. The population is n and the chance of occurrence q in the case of each individual. The mean frequency of occurrence is nq = m. But if q be very small we have seen that the series is

$e^{-m}\left(1 + \dfrac{m}{1!} + \dfrac{m^2}{2!} + \dfrac{m^3}{3!} + \dots\right)$,




TABLE XIV. 
Marriages of Uncle and Niece (1900—1909). 


| | | over 
| Marche 0. 3 | 
M.| 7-41] 2:22] -04 
B. 7 3 - _ | | | | | | 
| 
| Umbria O.| 6 3 1 
| M | 6-:06| 303| -13| “02 
B.| 5°90] 3°28) -73; — | | ee 
M | 5°49| 3-29] -99| -03 
B.| 6°04] 259} -31/ -10| | 
| Sardegna O. 2 5 | 3 | | 
M.| 3°33) 366 201) ‘74, “Ol 
B.| 201 496) 303) — | — | — | | 
| 
Emilia 1 3 2 2 1 
M.| 1°11| 2°44 2°68] 1°97) 1°08} -48| -18| | 
B. | 1:09) 2-43 2-70/ 1°98 1-08) -17| 01 | 
| | | 
Abruzzi O. | 3 1 3 2 | — | 1 — 
M.| 61) 1-70 238/ 223 156) -41/ -02| 
| B. 1°58) 2-48] 2°43) 1°68/ 87) 34) — | 
| 
| Lazio 0. 1 1 2 3 _ | 2 | 1 | —}|—} | | 
| M.| 1 2°17| 2°24) 1°73/1-07; °55| -25| -10| -O1 | 
B.| 1:56 2°09) 200 1°54 | 1-01 | 31 | 14) -06| -03| 
| | | | 
M.| -98| 1:77! 2°13, 1:91/1:38| -83/ -42| -19| -O1| — | 
B. | 55) 1:30| 1°77| 1°80) 1°53/ 1°14] -77 29| -16| +10 05 | -03 | 01 ‘Ol 
| | | | | 
M.| 1°13] 1°69| 1:90/1°71/1-28| -82| -46| -23| -10| -04| -02| 01) — 
B.| 21| -70| 1°26) 1°62) 1-67/1-46/1-13| -79| -51| -30 09 | -05 | 02 -01| -O1 
| 
Toscana O. | — 1 | 2 1/12} | | 
M.| 66) 1°19/ -81| -49| -13| -06| 02 -o1| — 
31) -73| 1-07| 1-25) 127/117] 1-01} -50| -27/ -09| -06| -10 
Calabria O. | — — | -- 2 2 1 1 1 ee 1 
M.| — 01} 05) 16) | 1-20] 1-33 / 1°32] 1-17) +95 | -48 | | -20 | 
B.| 00; -96/1°16| 1-21 | 1°17] 1-04} 69 | 37 | -25| -16 | 
| 
M, | 24°88 | 19°47 | 14-93 | 12°72 | 10°39 | 7°93 | 5-76 | 4°10 | 2°96 | 2-17 | 1°57| 1°13 | 78-51 | 32 -18| +20 
B, | 24°22 | 22-16 | 16-46 | 11-73 | 9°35 | 6-88 | 4-98 | 3-74 | 2-84 | 2-19 | 1-71 | 1-29 | -97 pa Ps Ne 26 
| 




from which n has disappeared, and in this exponential we have seen that
Bortkewitsch and Mortara suppose m small, i.e. 10 or under. We have seen
that there is no reason why m should be absolutely small, and that the name
given by Bortkewitsch to the Poisson-Exponential, i.e. the “Law of Small
Numbers,” is misleading. But supposing the mean occurrence m to be small,
it by no means follows that q need be small and n infinite. For if q = ½ and n = 4,
m would be “small,” and the sort of small number with which our authors deal,
but the mere fact that the mean frequency of occurrence was 2 would not justify
our using the Poisson-Exponential for

$$\left(\tfrac{1}{2} + \tfrac{1}{2}\right)^{4}.$$
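The contrast drawn here can be checked numerically. The following is a minimal sketch, not part of the original paper, which sets the terms of the binomial (½ + ½)^4 beside the terms of the Poisson-Exponential having the same mean m = 2. The function names are illustrative only.

```python
from math import comb, exp, factorial

def binomial_terms(n, q):
    """Terms of (p + q)^n: chance of x occurrences, x = 0..n, with p = 1 - q."""
    p = 1.0 - q
    return [comb(n, x) * q**x * p ** (n - x) for x in range(n + 1)]

def poisson_terms(m, x_max):
    """Terms e^{-m} m^x / x! of the Poisson-Exponential, x = 0..x_max."""
    return [exp(-m) * m**x / factorial(x) for x in range(x_max + 1)]

n, q = 4, 0.5
m = n * q                      # mean frequency of occurrence, here 2
binom = binomial_terms(n, q)
pois = poisson_terms(m, n)

for x, (b, e) in enumerate(zip(binom, pois)):
    print(f"x={x}  binomial={b:.4f}  exponential={e:.4f}")

# The exponential also assigns some chance to more than 4 occurrences,
# which the binomial with n = 4 cannot produce at all.
print("exponential mass beyond x = 4:", round(1.0 - sum(pois), 4))
```

The two columns differ at every term although the mean is "small", which is the point of the paragraph: smallness of m alone does not make the exponential a fair substitute for the binomial.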

The fact is that when our authors speak of the deaths in a Prussian Army 
corps from the kick of a horse, or the suicides of schoolgirls, or the deaths from 
chronic alcoholism as being “small,” they really mean small as compared with the 
number of persons exposed to risk. They had probably in mind all the men in 
the army corps, all school-girls or all individuals liable to death in the towns 
considered. But are all men in the army corps,—or only the cavalry, the artillery, 
etc.,—equally liable to death from the kick of a horse? Is every school-girl equally 
liable to commit suicide or only a very few morbid and unhealthy minded girls? 
Is every individual equally liable to die of chronic alcoholism, or only perhaps the 
10 or 12 confirmed and aged drunkards in a town? The moment we realise these 
doubts, what is the population n to be considered? It is not m being small, but 
the smallness of m/n that leads us to believe that the binomial may have passed into 
an exponential. But if only six school-girls per year in a community are in the 
least likely to commit suicide, what is the justification for the “law of small 
numbers,” if the average number of suicides be 65? Further, if we pass to even 
a large community in which the tendency to commit suicide is graded (a very
probable state of affairs), m might be small and n large, and yet since q is not
constant, the binomial and its exponential limit would not be applicable; and this
non-applicability would not depend on “lumpiness,” i.e. contagion or example in
occurrence. Thus the probability might be:


$$(p_1 + q_1)(p_2 + q_2)(p_3 + q_3)\dots(p_n + q_n)$$
with all the p’s independent (as in spinning differently divided teetotums) and not 
correlated (as they would be in drawing successive non-returned cards from a pack). 
It would seem therefore that a priori we should not expect the conditions for the 
exponential to be fulfilled in most of the cases selected by Bortkewitsch and 


Mortara, although with perfect mixing we might expect it in the cases cited 
by “Student.” 
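As a hedged illustration of the product of unequal binomial factors written above, the sketch below expands such a product for a purely hypothetical set of graded chances (a few individuals at high risk, many at very low risk) and compares the variance of the result with its mean. The chances chosen are assumptions made for the example and are not taken from any of the data discussed.

```python
def product_of_binomials(qs):
    """Expand (p1 + q1)(p2 + q2)...(pn + qn) with p_i = 1 - q_i.

    Returns dist, where dist[x] is the chance of exactly x occurrences
    among independent individuals whose chances q_i differ.
    """
    dist = [1.0]
    for q in qs:
        new = [0.0] * (len(dist) + 1)
        for x, prob in enumerate(dist):
            new[x] += prob * (1.0 - q)   # the event does not occur for this individual
            new[x + 1] += prob * q       # the event occurs for this individual
        dist = new
    return dist

# Hypothetical graded population: 4 individuals with q = .4, 100 with q = .004.
qs = [0.4] * 4 + [0.004] * 100
dist = product_of_binomials(qs)

mean = sum(x * pr for x, pr in enumerate(dist))
var = sum((x - mean) ** 2 * pr for x, pr in enumerate(dist))
print(f"n = {len(qs)}, mean = {mean:.4f}, variance = {var:.4f}")
```

For a Poisson-Exponential the variance would equal the mean. Here the mean is small and n is large, yet the two differ markedly, so the exponential would not describe such a population. This sketch shows only that graded chances break the exponential; it does not by itself reproduce the negative binomials found in the data.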


(14) In order to test this point on adequate numbers, the ages at death of all
persons dying over 70 years of age were extracted for a period of three complete 
years from the notices of death in the Times newspaper for the years 1910—1912: 
see Table XV. These announcements of death are those of individuals in a fairly 
limited class, which may be considered stable in numbers for these three years. 


TABLE XV.
Deaths per Day of the Aged. Times newspaper.
[Tabulated frequencies not legible in this copy.]


Table XVI shows that the announcements of deaths over 70 years of age only
amount to 3.74 per day for males and 3.52 for females. These are certainly “small
numbers,” but “small” with regard to what? Are we to consider n as the number 
of the population which embraces, (i) all the individuals of the limited classes of 
the same range of ages as the defunct, (ii) all the individuals announced as dead 
on the same day, (iii) all the individuals of whatever ages of the class which 
announces deaths in the Times? Or, should we refer to all the individuals in the 
community of that range of ages, or the whole community at large, i.e. the chance
that in a population of so many millions an individual over 70 or 80 as the case 
may be will die and have their death announced in the Times newspaper? Well, 
it really does not matter, because if for any one or all of these populations the 


binomial $(p + q)^n$ applied, we should get, if q were small and n large, the Poisson
series

$$e^{-m}\left(1 + m + \frac{m^2}{2!} + \frac{m^3}{3!} + \dots\right),$$

and this quite regardless of the size of n. If therefore we did find a series in 
which q was very small and n large, we might not be able to say to which, if any 
of the above populations n applied. On the other hand the mere fact that m is 
small is no justification for the use of the “law of small numbers” as is sometimes 
implied. If it be argued that the small number of people who die over 80 and 
have their names recorded in the Times are drawn from a small population, we 
reply so it may be argued are the school children who commit suicide, the uncles 
who feel any inclination to marry their nieces, or the men liable to die of chronic 
alcoholism ; and we can in the case of the announcement of deaths test the values 
of q and n on fairly adequate numbers. As a matter of fact we do not know, in 
attempting to apply the Poisson formula, what is the population from which we 
are drawing our individuals, and the justification of the Poisson formula lies only 
in showing that there actually does exist a binomial for which q is small and 
n large. We might imagine that as we got to the higher ages practically every 
person of that age would die, or that in our notation q would be 1 nearly and p be
a very small quantity; thus an approach might be made to the Poisson-Exponential.
But the approach to the Poisson-Exponential arises not through q approaching
unity but from q becoming very small. Nor again in the lower age groups do we
find ourselves left with a positive binomial. 
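The test for a positive or negative binomial can be sketched in a few lines. The example below assumes that the binomial $(p + q)^n$ is fitted from the first two moments of the observed frequencies, so that m = nq and the second moment about the mean is npq; this assumption agrees with the relation n·q = m among the constants of Table XVI, but the paper's own working is not reproduced here. The daily counts are hypothetical.

```python
def fit_binomial_by_moments(freq):
    """Fit (p + q)^n from the mean and second moment of a frequency distribution.

    freq[x] is the number of days on which x occurrences were observed.
    Assumes mean m = n*q and second moment mu2 = n*p*q, hence
    p = mu2 / m, q = 1 - p, n = m / q.
    """
    total = sum(freq)
    m = sum(x * f for x, f in enumerate(freq)) / total
    mu2 = sum((x - m) ** 2 * f for x, f in enumerate(freq)) / total
    p = mu2 / m
    q = 1.0 - p
    n = m / q          # undefined if mu2 == m exactly (the pure exponential case)
    return p, q, n, m

# Hypothetical daily counts, not taken from the paper.
freq = [40, 35, 25, 12, 6, 3, 2]
p, q, n, m = fit_binomial_by_moments(freq)
print(f"p = {p:.5f}, q = {q:.5f}, n = {n:.4f}, m = {m:.4f}")
```

With these invented counts the second moment exceeds the mean, so p comes out greater than unity and both q and n are negative: the fitted curve is a negative binomial, the case that recurs throughout the tables.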


In all cases except women over 90 years of age, we find that a negative 
binomial best fits the observations. Even in the case of the announcements of 
deaths of women over 90 years, we find that the approach of the binomial to the 
Poisson exponential depends on 


$$\left(1 + \frac{1}{53.3333}\right)^{53.3333}$$

being measured with sufficient approximation by $e = 2.71828$. But

$$\left(1 + \frac{1}{53.3333}\right)^{53.3333} = 2.69323,$$


and is therefore not a very close approximation, a result shown when we use
a binomial by the substantial improvement in the measure P of “goodness of
fit.” Even in this case we are not prepared to say what is the population for
which the q = .01875 in the case of these announcements of deaths of women over
90 years of age. It can scarcely be that there are only 29 women over 90 years
of age living in the country, whose deaths are likely to be announced in the Times
when they occur. Further the probable error of q is such that actually this case
might equally well be a random sample from material following a negative
binomial. Analysing our material we see that our first two cases of males and
the first three of females are such that they could not possibly be random samples
from positive binomials, the probable errors of q are too small. Next, seven cases
out of the eight do give actually negative binomials and the eighth might, having
regard to its probable errors, well be a negative binomial. Thus although our
daily occurrences are certainly in Bortkewitsch and Mortara’s sense “small numbers,”
they give no support to the use of a Poisson-Exponential.


TABLE XVI.
Constants for Deaths of Aged.

Men.

| Age over | p | q | Probable Error of q | n | Probable Error of n | m | P (Binomial; Exponential) |
| 70 years | 1.12965 | −.12965 | ±.03314 | −28.8747 | ±7.3784 | 3.7436 | .1355; .0045 |
| 80 years | 1.12152 | −.12152 | ±.03349 | −14.0703 | ±3.8704 | 1.7099 | .9358; .1129 |
| 85 years | 1.01903 | −.01903 | ±.02902 | −43.2996 | ±67.5797 | .8239 | .9737 (one value illegible) |
| 90 years | 1.00654 | −.00654 | ±.02934 | −42.8498 | ±192.3069 | .2801 | .6741; .6672 |

Women.

| Age over | p | q | Probable Error of q | n | Probable Error of n | m | P (Binomial; Exponential) |
| 70 years | 1.34012 | −.34012 | ±.04161 | −10.3522 | ±1.2307 | 3.5210 | .8084; .0000 |
| 80 years | 1.20770 | −.20770 | ±.03294 | −10.4400 | ±1.8309 | 2.1569 | .9686; .0018 |
| 85 years | 1.14507 | −.14507 | ±.03077 | −8.1447 | ±1.9627 | 1.1816 | .1062 (one value illegible) |
| 90 years | .98125 | +.01875 | ±.02779 | +29.0873 | ±43.0634 | .5447 | .9848; .8116 |
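Two of the numerical statements in this discussion are easily verified. The sketch below, which is an addition and not the author's working, first evaluates $(1 + 1/53.3333)^{53.3333}$ for q = .01875 and compares it with e, and then checks that n·q reproduces the mean m for the four cases of men, using the constants as they can be read from Table XVI.

```python
from math import e

# Approach of the binomial to the exponential for women over 90: q = .01875
q = 0.01875
power = 1.0 / q                         # 53.3333...
approx_e = (1.0 + q) ** power           # (1 + 1/53.3333)^53.3333
print(f"1/q = {power:.4f}, approximation = {approx_e:.5f}, e = {e:.5f}")

# Constants for men as read from Table XVI: (q, n, m)
men = [
    (-0.12965, -28.8747, 3.7436),   # over 70 years
    (-0.12152, -14.0703, 1.7099),   # over 80 years
    (-0.01903, -43.2996, 0.8239),   # over 85 years
    (-0.00654, -42.8498, 0.2801),   # over 90 years
]
for q_i, n_i, m_i in men:
    print(f"n*q = {n_i * q_i:.4f}   tabulated m = {m_i:.4f}")
```

The first line shows a departure from e of about .025, which is the "not very close approximation" of the text; the remaining lines show that both q and n being negative still gives the positive mean m.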


If it be said that these “small numbers” differ in character from those used 
by our authors, the reply must be: we know in none of these cases the real 
population from which deaths are to be considered as drawn. The chances of 
death are certainly graduated with age, but the chances of suicide are graduated 
with temperament, and the same is true of alcoholism, or again the chance of 




death by accident is graduated with occupation. At any rate until those who 
support the use of the “law of small numbers” demonstrate its application on 
material, where the probable errors are sufficiently small for us to measure the true 
value of q and n, no advance can be made. Nor until we have clear ideas of the
population n in which the chance is q, is it possible to assert that it may be used
for the suicides of school children, and the marriage of uncle and niece, and must 
not be used for the deaths of aged people, which certainly occur in “smaller” 
numbers. 


In the illustrations of deaths we have taken, certainly the Poisson-Exponential 
is not the rule, although the distributions appear to approach it, as towards a limit, 
when the number of deaths approaches zero. But our data which show the rule of
the negative binomial appear to show it in no more marked manner than much of 
the data selected by Mortara himself indicate the negative binomial, although owing 
to the sparsity of his material his results are far more erratic and unreliable. Nor 
is Bortkewitsch much behind Mortara in the evidence he produces for a negative 
binomial being as reasonable a description—possibly owing to inherent lumpiness— 
as a positive binomial of these “small number” frequencies.


(15) Conclusions. 


(a) The Poisson-Exponential gives a fairly reasonable method of dealing with 
the probable deviations of small sub-frequencies in the case of random sampling. 
When the average value of a sub-frequency is not more than 3% of a population,
then Poisson’s formula suffices in most practical cases to determine the range of 
error likely to be made. Tables are given to assist its use. 


(b) The application of the Poisson-Exponential to various data by Bortkewitsch 
and Mortara has hardly been justified by those writers, for they have not tested 
whether the probability q is small and positive and the power n large and positive 
in the cases considered by them. When this is actually done, it is found that 
their hypotheses, having regard to the probable errors of q and n, are largely 
unjustified in the case of their illustrations. Even in such cases where it is 


justified, a binomial gives a better result as measured by the test for goodness 
of fit. 


(c) Negative binomials repeatedly occur and give just as good fits, where 
they occur, as positive binomials. In the illustrations taken by Mortara, the 
frequency 10 used is so small that it is not possible to assert that either positive 
or negative binomials are demanded by the data. Still the average p of his results 
is very significantly in excess of unity. 


(d) Mortara like Bortkewitsch cuts out of his data straight off all districts
with, on the average, more than 10 cases in the year. But the q obtained from
20, 40, or even 100 cases in a population of 100,000 is a small q in the sense that
the resulting binomial is adequately expressed by a Poisson-Exponential. There 




appears to be no valid reason for such a procedure, except the experience that 
many such cases actually give negative binomials*. It seems to us theoretically 
unjustifiable to apply the exponential to 8 cases say in a district of 100,000, and 
not apply it to 12 cases in a district of 200,000. Actually p may be 14 in the 
first case and only 0.9 in the second.


(e) We consider that the reasonable method in every case is not to start with 
the Poisson-Exponential, which screens the truth or falsity of the a priori
hypotheses, but to fit a binomial regardless of the magnitude of p. The fact that
quite as good fits are obtained with negative as with positive binomials suggests
that a new interpretation of these cases of “negative probability” is requisite.
Several cases of the interrelation of “contributory cause groups” which provide
a series represented by a negative binomial $(p - q)^{-n}$ have been recognised†.
A general interpretation based on a very simple conception seems needed for 
these demographic cases in which the law of small numbers appears far more often 
to correspond to a negative than to a positive binomial. 
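Conclusion (a) may be illustrated by a direct comparison of binomial and Poisson-Exponential terms when the sub-frequency is 3% of the population. The sketch below is an addition under stated assumptions: the population of 1000 is hypothetical, and the comparison is confined to 0 to 80 occurrences, outside which both sets of terms are negligible.

```python
from math import comb, exp, factorial

def binomial_term(n, q, x):
    """Chance of exactly x occurrences under the binomial (p + q)^n."""
    return comb(n, x) * q**x * (1.0 - q) ** (n - x)

def poisson_term(m, x):
    """Chance of exactly x occurrences under the Poisson-Exponential, mean m."""
    return exp(-m) * m**x / factorial(x)

n, q = 1000, 0.03          # hypothetical population; sub-frequency of 3 per cent
m = n * q
max_diff = max(
    abs(binomial_term(n, q, x) - poisson_term(m, x)) for x in range(0, 81)
)
print(f"mean = {m:.1f}, largest difference between corresponding terms = {max_diff:.5f}")
```

The largest discrepancy in any single term is of the order of one part in a thousand, which is consistent with the statement that the exponential suffices in most practical cases of random sampling at this level.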


This paper was worked out in the Biometric Laboratory, and I have to thank 
Professor Karl Pearson for his aid at various stages. 


* Can we cite in addition perhaps, the fact that existing tables of $m^x e^{-m}/x!$ do not extend beyond
m = 10?
† Pearson, Biometrika, Vol. IV. p. 208.


THE RELATIONSHIP BETWEEN THE WEIGHT OF THE 
SEED PLANTED AND THE CHARACTERISTICS OF 
THE PLANT PRODUCED. II. 


By J. ARTHUR HARRIS, Ph.D., Carnegie Institution of Washington, U.S.A. 


I. INTRODUCTORY REMARKS.


1. In Biometrika, Vol. IX. pp. 11–21, March 1913, were published constants
showing the relationship between the weight of the seed planted and the number 
of pods on the plants produced in twenty experimentally grown series of Phaseolus 
vulgaris. From the economic view point, number of pods is the most important 
character which could have been chosen, total weight of seed matured only 
excepted. But to the student of morphogenesis, or of the physiology of seed 
production, other characters are of equal interest, while the comparison of the 
correlations for various features must yield results of significance. 


The purpose of the present communication is the presentation of the constants 
measuring the influence of the weight of the seed planted upon the number of 
ovules formed and the number of seeds developing in the pods of the matured 
plant. 

These various relationships have now been worked out for a relatively large 
bulk of material. Altogether there are 29 individual series belonging to 5 
varieties, involving 17,953 plants, from which 119,192 determinations of the 
number of ovules and seeds per pod have been made. The reply to the possible 
suggestion that the expenditure of effort in the collection and analysis of such 
masses of data is quite unjustifiable is twofold. First, a major portion of the
labour involved was necessary for investigations not touched upon here. Secondly, 
there are many problems of morphogenesis and physiology which can only be 
solved by the amassing of large series of accurately determined biometric constants 
which when sufficiently numerous may themselves be the materials for statistical 
analysis. The data here contained are recorded in partial fulfilment of such 
requirements for certain definite morphological and physiological problems. 

The present paper is limited strictly to matters of fact ; general discussions are 
reserved until further data—much of which is already available in a raw state— 
are reduced. 




II. MATERIALS. 


The first paper may be consulted for details not entered here. The data 
analysed are drawn in part from the series already considered for the relationship 
between weight planted and number of pods produced. In addition to the White 
Flageolet, Navy and Ne Plus Ultra varieties already treated, several lots of 
Burpee’s Stringless and two of Golden Wax are available. 


III. ANALYSIS OF DATA.


2. Data for Number of Ovules and Seeds per Pod. 


Tables III—VI, similar to those of the preceding paper, give in a condensed 
form the data for the correlations discussed. Table I* gives the correlations 


TABLE I. Correlation and Partial Correlation Coefficients. 


Series | Number of Plants | Correlation, Weight and Pods, $r_{wp}$ | Number of Pods | Correlation, Weight and Ovules, $r_{wo}$ | Partial Correlation, ${}_{p}r_{wo}$ | Correlation, Weight and Seeds, $r_{ws}$ | Partial Correlation, ${}_{p}r_{ws}$
LL 1141 | --008+ 8043 026+°008| -027+°008 —°013+°008 | — 013+ 008 
LG 182 “066 +050, 806 1534-023} —-100+ |—-103 + 024 | 

| GG 750 |-°368+°021| 6310 018+°008' -029+°008 004+°008| -016+°008 

| GGH 583 5251 045+°010| °019+°009 004+ 009 

| GGH2 499 "176+ 029 3502 093+°011; °083+°011 063+°011| 049+-011 

'GGHH 396 "193 + 033 2656 — "022+ °013 -—-042+°013 — 029+ 013 |— 048+ 013 
GGD 514 "159 + 039 1438 107+ °018 ‘089+°018 071+°018| ‘068+-018 
GGD2 449 | +030 | 1227 044+ 019, 018+°019 ‘062+°019 
GGDD | 342 | 137+°036| 807 101+ ‘092+°024, ‘089+°024| 076+ ‘024 
HH 1484 | ‘177+°017; 14029 010+ |—"039+°006 ‘007+°006 |—-054+ 006 
HHH 1271 "145+ °019 11230 — ‘000 + ‘006 | — 030 + 006 016 | —°014+°006 
HD 1416 "129+ 018 | 5581 — 044+ °009 | —-067+°009 — -049+ 009 | — -052+-009 
HDD 1204 121+°019| 5449 — 029 + 009 | — 065+ — |—-030+ -009 
DD 513 + ‘027 1827 098+°016) °009+°016 050+°016| ‘008+°016 
DDD 459 *215 2018 044+°015; 000+°015! °046+4°015 006 + 
DH 670 + | 5955 075 + 009 |— -005+°009 ‘076+°009 | — 013+ ‘009 

DHH 565 + 028 5019 045+°010| ‘008+°010 011+ °010 | — 025+ 
USC 530 150+°029; 2569 059+°013; °032+°013) 024+°013 
USS 680 155+°025| 6605 023 + 008 |— 000+ 008} °024+ 008 

| OSH 361 + 035 3406 032+°012; 001+°012, °020+°012 
USHH 224 143+ 044 1743 “112+ 016 | 098+ °011+°016 |— 004+ 016 

| USD 312 195 + °037 | 802 "127+ °023) ‘098+°024) 071+°024| ‘067+ 

| USDD 237 241 +041 | 851 +°022| -090+°023 

| FSC 586 + 027 2876 047+°013! ‘089+°012} -073+°013 

FSS 868 098+ 023; 7809 021+°008; °001+°008| 

| FSH 475 + 4541 °049+°010) 018+°010 —*045+°010 |-—°073+°010 
FSHH 427 “121+ °032 | 3837 015 +°011 |— 011 040+ °011 O17+ 011 
FSD 428 130 + 032 | 1449 060 + ‘018 | — 027+ — | — ‘036+ 018 
FSDD 387 144+ 034 1556 037+°017| °013+°017 047+ 024+ °017 

| | | 


* The weight of the seed planted was weighted with the number of pods counted. Thus $\bar{w}$ and $\sigma_w$
differ slightly from those of Table II of the first paper. Sheppard’s correction was used for seed
weight, but not for the integral variates ovules per pod or seeds per pod.


between weight of seed planted and ovules per pod, $r_{wo}$, and between weight
planted and number of seeds matured per pod, $r_{ws}$. The partial correlation
coefficients,

$${}_{p}r_{wo} = \frac{r_{wo} - r_{wp}\,r_{po}}{\sqrt{1 - r_{wp}^2}\,\sqrt{1 - r_{po}^2}}, \qquad {}_{p}r_{ws} = \frac{r_{ws} - r_{wp}\,r_{ps}}{\sqrt{1 - r_{wp}^2}\,\sqrt{1 - r_{ps}^2}},$$

showing the correlation for weight (w) and ovules (o) and weight and seeds (s) for
constant numbers of pods (p) per plant are also given. These require in addition
to the correlations here given $r_{wp}$, $r_{po}$ and $r_{ps}$, the correlations between the number
of pods per plant and the number of ovules and seeds in these pods. Values
of $r_{wp}$ are available from the preceding paper (Biometrika, Vol. IX. p. 21, Table
VII) and from a supplementary table giving nine additional constants*. For the
reader’s convenience these are reprinted in this table. The values of $r_{po}$ and $r_{ps}$
will be published in connection with another problem.
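The partial correlation for constant number of pods can be computed as in the following sketch, which assumes the usual first-order formula $(r_{wx} - r_{wp} r_{px}) / \sqrt{(1 - r_{wp}^2)(1 - r_{px}^2)}$. Since the values of the correlations of pods with ovules and with seeds are not given in this paper, the inputs below are hypothetical and serve only to show the arithmetic.

```python
from math import sqrt

def partial_r(r_wx, r_wp, r_px):
    """Correlation of weight (w) with x (ovules or seeds) for constant pods (p).

    r_wx: correlation of weight with x
    r_wp: correlation of weight with number of pods per plant
    r_px: correlation of number of pods per plant with x
    """
    return (r_wx - r_wp * r_px) / (sqrt(1.0 - r_wp**2) * sqrt(1.0 - r_px**2))

# Hypothetical values of the order met with in these series.
r_wo, r_wp, r_po = 0.05, 0.15, 0.20
print(f"partial correlation = {partial_r(r_wo, r_wp, r_po):.4f}")

# If r_wo were exactly the product r_wp * r_po, the partial correlation would vanish.
print(f"vanishing case      = {partial_r(r_wp * r_po, r_wp, r_po):.4f}")
```

The second call shows the case in which the whole of the observed correlation is the necessary resultant of the other two relationships.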


The probable errors have all been calculated on the basis of the number of 
pods examined as N. There is considerable question whether the actual number
of seeds planted should not have been used instead; the degree of trustworthiness 
of a constant is perhaps not greater than is indicated by the lowest number of 
actual measurements (irrespective of the number of associated measures taken). 
The point is not of the greatest practical importance for the present case, since the 
number of series is so large that conclusions can be drawn from the run of the 
constants as a whole and too much weight need not be given to individual series. 
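The effect of the choice of N on the probable error may be seen from a short computation. The sketch assumes the conventional formula $.67449\,(1 - r^2)/\sqrt{N}$ for the probable error of a correlation coefficient, and takes for illustration the LL series as read from Table I (r = .026 for weight and ovules, 8043 pods, 1141 plants).

```python
from math import sqrt

def probable_error_r(r, n_obs):
    """Probable error of a correlation coefficient r based on n_obs observations."""
    return 0.67449 * (1.0 - r**2) / sqrt(n_obs)

r = 0.026
pe_pods = probable_error_r(r, 8043)     # N taken as the number of pods examined
pe_plants = probable_error_r(r, 1141)   # N taken as the number of plants
print(f"probable error on pods   = {pe_pods:.4f}")
print(f"probable error on plants = {pe_plants:.4f}")
```

On the number of pods the probable error agrees with the ±.008 printed in Table I; on the number of plants it is between two and three times as great, which is the question of trustworthiness raised in the paragraph above.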


A glance at the table shows that the correlations are low throughout. The 
suggestion naturally arises that some of the extremely low values may be due to 
non-linear regression. The regression straight line equations and the results of 
Blakeman’s test† are given in Table II. Here r, η and the straight line equation
for the regression of ovules and seeds per pod on weight planted (in working units)
are determined by the conventional formulae. The final two columns give the
values of

$$\frac{1}{\chi_1}\cdot\frac{\tfrac{1}{2}\sqrt{\zeta}}{\sqrt{1 + (1 - \eta^2)^2 - (1 - r^2)^2}},$$

when $\zeta = \eta^2 - r^2$ and $\chi_1 = .67449/\sqrt{N}$.
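A hedged sketch of the criterion follows. The printed expression is imperfectly preserved, so the form used here, $\tfrac{1}{2}\sqrt{\zeta}\,/\,\{\chi_1\sqrt{1 + (1-\eta^2)^2 - (1-r^2)^2}\}$ with $\zeta = \eta^2 - r^2$ and $\chi_1 = .67449/\sqrt{N}$, is an assumption. It is supported by the fact that it reproduces closely the Test A value printed for the GGD2 series (ovules), for which r = .0442, η = .1276 and N = 1227 pods as read from Tables I and II.

```python
from math import sqrt

def blakeman_criterion(r, eta, n_obs):
    """Ratio used to test linearity of regression (assumed form, see lead-in).

    r:     correlation coefficient
    eta:   correlation ratio
    n_obs: number of observations taken as N
    """
    zeta = eta**2 - r**2
    chi_1 = 0.67449 / sqrt(n_obs)
    denom = sqrt(1.0 + (1.0 - eta**2) ** 2 - (1.0 - r**2) ** 2)
    return 0.5 * sqrt(zeta) / (chi_1 * denom)

# GGD2 series, ovules per pod, N taken as the number of pods (Test A).
test_a = blakeman_criterion(0.0442, 0.1276, 1227)
print(f"Test A, GGD2 ovules = {test_a:.3f}")
```

Because the result varies as the square root of N, substituting the much smaller number of seeds planted (Test B) lowers the value in proportion, which is why the two tests lead to different conclusions.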


All the straight lines are shown in Diagram 1. The empirical means are 
indicated in all of the cases where it can be done without confusion. The slope is 
very slight and the agreement of observed and predicted means not very close, 
especially near the ends of the range, where the number of observations is small. 
There is, however, no clear indication that a curve of a higher order would describe 
the results better than a straight line. This irregularity is precisely what is to be 
expected in cases of low correlation. 


* Harris, J. Arthur, “An Illustration of the Influence of Substratum Heterogeneity upon Experimental
Results.” Science, N. S. Vol. XXXVIII. pp. 345–346, 1913.
† Blakeman, J., Biometrika, Vol. IV. pp. 332–350, 1905.


TABLE II. 
Tests for Linearity of Regression. 


Series | Correlation, r, and Probable Error | Correlation Ratio, η, and Probable Error | Regression Straight Line Equation | Blakeman’s Criterion, Test A

For Ovules: 
USS 
DHH 
USDD 
GGD2 
FSS 
HH 


For Seeds: 
USS 
DHH 
USDD 
GGD2 
FSS 


"0232 + ‘0083 
+ ‘0095 
-2381 + ‘0218 
“0442 + 0192 
"0209 + 0076 
“0098 + 0057 


+ 
0111+ 0095 
1313 + °0227 
0794+ ‘0191 
+ ‘0076 


0657 + ‘0083 
0788 + 0095 
+ 0211 
1276+ 0189 
+ ‘0076 
0661 + ‘0057 


0946 + -0082 
0541 + °0095 
"1932 + *0223 
*1760 + ‘0187 
0499 + ‘0076 


5°4230 + w 
4°9385 + °0257 w | 
3°6886 +°LOO1 w | 


4°7224+ 0137 wv 


5°3600+ w | 
| 


3°5870 + 0206 w 
4:1521+°0106 w 
2°1840 +- 0940 2 | 
2°4735 + 0346 w | 
3°0712 + w 


Blakeman’s 
Criterion, 
Test B 


1°688 
1151 
2°102 
1°967 

“754 
1°678 


2°351 
1°869 
1°650 
2°529 

931 


HH + 42119 +0058 w | 2°739 


0953 + °0057 


Blakeman’s criterion has been applied in two ways, A and B. In the first the
actual number of pods examined has been taken as N. In test B the number of
seeds planted (not the weighted number) has been used in obtaining $\chi_1$. If the
first test be accepted as the proper one, it follows that regression cannot safely be
regarded as linear. But there are two important points to be taken into account.

The correlation ratio $\eta$ depends upon the squares of the differences in means, hence
it has always a positive value, which may be very substantial because of the errors
of sampling when the number of individuals per array is small. Thus when r
approaches zero $\eta$ is limited by $\bar{\eta}$, the mean value of $\eta$ for zero correlation*.
Hence a test for linearity based on a comparison of $\eta$ with a very low value of r
may be misleading. Again, as pointed out above, the significance of both r and $\eta$
should perhaps be tested on the basis of the lowest number of measurements. If
this be done, as it is in test B, there is found very little evidence for non-linear
regression. Certainly, one cannot possibly assert that the low values of r, which
are seen throughout these experiments, are due to the number of ovules (seeds) per
pod at first becoming larger and then decreasing after a maximum is reached as
one passes from the lowest to the highest grade of seed weight.


The results of Table I are also shown graphically in Diagram 2. Here the
relationships for weight of seed planted and number of pods on the plant developing
are also indicated as a basis of comparison. The values of both $r_{wo}$ and $r_{ws}$ are in
general conspicuously lower than the low values of $r_{wp}$. But very few of them
drop below the zero bar; one is forced to the conclusion that there is a distinct
though very slight correlation between weight and ovules and between weight
and seeds.

* See K. Pearson, Biometrika, Vol. VIII. pp. 254–256, 1911.


Blakeman’s Criterion, Test A (Table II), in the order printed: 3.720, 3.431, 11.096, 3.152, 2.263, 5.159, 5.182, 8.712, 4.181, 2.793.


Consider in somewhat greater detail the signs and magnitudes of these
correlations*.

Of the 26 values of $r_{wo}$ only 4 are negative. The mean value of the 22 positive
coefficients is +.0673; the mean of the 4 negative is −.0236; the mean of all
(regarding signs) is +.0533.

For the relationship between weight of seed planted and number of seeds
matured per pod, $r_{ws}$, 21 constants are positive and 5 are negative. The mean of
the positive coefficients is +.0502; the mean of the negative values is −.0303;
for all 26 correlations the mean (regarding signs) is +.0348.

Thus both correlations are (as is clear from the diagrams) unquestionably
positive but very low.

Apparently the relationship for weight and ovules is slightly closer than that
for weight and seeds per pod, but the difference is too slight to justify any final
conclusion.
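The means quoted above can be checked against one another, since the mean of all 26 coefficients is the weighted mean of the positive and negative groups. The short sketch below is an added check using only the figures given in the text; the function name is illustrative.

```python
def pooled_mean(groups):
    """Mean of all values, given (count, group mean) pairs."""
    total = sum(count * mean for count, mean in groups)
    n = sum(count for count, _ in groups)
    return total / n

mean_r_wo = pooled_mean([(22, 0.0673), (4, -0.0236)])   # weight and ovules
mean_r_ws = pooled_mean([(21, 0.0502), (5, -0.0303)])   # weight and seeds
print(f"mean r_wo = {mean_r_wo:+.4f}")
print(f"mean r_ws = {mean_r_ws:+.4f}")
```

Both results agree with the printed means to within the rounding of the group means.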

Consider now the question whether the observed correlations $r_{wo}$, $r_{ws}$ are to be
regarded as direct biological relationships between the two variables w and o or w
and s, or whether they are to be looked upon as merely necessary resultants of
other interdependences. At present, the only other demonstrated correlation
which might tend to bring about sensible values of $r_{wo}$ and $r_{ws}$ is that between
number of pods per plant and number of ovules formed and number of seeds
developing per pod. Since number of pods per plant is known to be correlated
with weight of seed planted, while both number of ovules and number of seeds per
pod are correlated with number of pods per plant, some correlation must be
expected between weight planted and number of ovules and seeds per pod. If
now the observed values of $r_{wo}$ and $r_{ws}$, which are always small, are merely the
necessary resultant of the relationships $r_{wp}$, $r_{po}$, $r_{ps}$, one would expect the partial
correlation coefficients, ${}_{p}r_{wo}$, ${}_{p}r_{ws}$, to be sensibly zero. If these partial correlations
are not sensibly zero, it can only mean that there is a direct (causal) relationship
other than the one just considered between number of ovules (or seeds) and the
weight of the seed planted.


The partial correlations and the correlations are shown side by side in
Diagrams 3 and 4. The lowering of the degree of interdependence between both
weight and ovules and weight and seeds by the correction for number of pods per
plant is clearly marked. In a number of cases in which the correlation coefficient
is positive the partial correlation coefficient is negative.


Thus only 4 of the 26 values of $r_{wo}$ are negative, while 9 of the partial
correlation coefficients have the minus sign. In only 5 cases is $r_{ws}$ negative, but in
11 of the series, the sign of ${}_{p}r_{ws}$ is negative. The mean values of the partial
correlations are very close indeed to zero. Thus ${}_{p}\bar{r}_{wo} = .0186$ as compared with
$\bar{r}_{wo} = .0533$; ${}_{p}\bar{r}_{ws} = .0099$ as against $\bar{r}_{ws} = .0348$.

* I have already shown (Science, N. S. Vol. XXXVIII. pp. 345–346, 1913) that the LL, LG and GG
series are open to question because of the lack of certain precautions in the cultures; while they are
included in the table of fundamental constants to avoid any possible criticism of selection of series they
will be left out of account in the following discussions.


[DIAGRAM 1. Regression of ovules per pod and of seeds per pod on weight of seed planted, in working units. The six upper lines are for ovules, the six lower for seeds.]

[DIAGRAMS 2 to 4. Values of the coefficients of correlation and of partial correlation between weight of seed planted and number of pods per plant, ovules per pod and seeds per pod, by series.]


J. A. Harris 


js | |e |¢ |e 6 | tex | | 
—|—|—|se |e /s |e |e |e |v |16 |et | zo | tt | tor | zor oe | sez | | 
—|—!—J/er | | | — | — | | oor | or | | oF | | | 
— | — | —| 96 | ser joe | 69 | esr |e | ert | ser | | oot | ost | | | cor | | 
| | 69 | ort | co | ott | | ost tee | | | GOE | | FOZ | | ORF | £89 
| et | | | ge | tie | seo | Lor] | | | | | GOLT | TOE | | | FEL 
eg | tr | | | ace | 969 | | | | | | | | 698% | | I8L 
| | | eee | | Gor | | | sat | | | | GIST | OFGT | ZOE | | FECL | LFIE | GIS 
| we | 2¢ | veo | ee | 6st | | | O81 | | | I99T | | EOF | | | FLE 
| | ¥9 | | | ost | | | | 6106 | | | | | | cege | | | SIF 
ore | LI¢ | 901 | | | oct | Foo | 006 | SBT | | Lee | | | | IZL | 
ee | | sat | ive | | got | zor | coz | | Teor | | | FOP Foes | FETE | | LAT 
ter | 612 | | gee | cor | | eer | cot | eee | 968 BLT | | GIST | Ger | 806 | FET 
ove | att | ste | | oz | | SIF | | | Gee GOL | | ZEIT Ges | | COAL OGL 
ges | oe | | | 96 09 | 906 | | | ere | | | gor | | 122/99 | 
tor | | 9¢.| | ett | | ze | | — | — | — 218 | we | zo | 198 | cet ios | ze |8 
e |e (st | eo | re jet | w | | — | — | — | | | 19 | | ee jes | at 
or | se | | | et |ve |e |e lat |oo |so |i |w 
| ies | st — | — | — | ert | 908 | | OOT | | — |— 
— | — | =o = — — — — | — — — — — 
| | | | 
speag spoeg | | speeg | speeg spoog | speog 
| 


(8) $¢8-—008- 
(88) 008-—-GLL+ 
(18) GLL-—OGL+ 
(08) 
(66) S@L-—O0L- 
(88) 00L:-—GL9- 
GL9-—0G9- 
(9%) 099--—G69- 
(G8) S¢9-—009: 
009-—EL9- 
(88) 
(68) 09¢-—969. 
(0@) 
(61) 
(81) 0S%-— get. 
(41) 
(91) 00%-—¢Le- 
(1) $L8-—098- 
(41) 
($1) 9¢6-—008- 
(61) 008-—¢L6- 
(IT) —096- 
(on) 
(6) $66-—008- 
00¢-—AZI- 


(4) 
(9) OGI-—9éT- 
(¢) 
(%) 00T-—$¢Z0- 


(¢) 240-—-090. 


peg 
jo 


11 


Biometrika x 


TL ATAVL 


q 
| i 
| 
| | 
| | | 
| 
| 1 : 
| 
| | 
q 
| | 
{ 
| 
| 


Weight of Seed and Characteristics of Plant 


[Tables, printed sideways in the original: the entries are not recoverable from this scan.]




IV. RECAPITULATION. 


The facts presented in this paper and in the preceding studies justify the 
following conclusions. 


1. In Phaseolus vulgaris there is a sensible relationship between the weight
of the seed planted and the number of pods on the plant developing from it. The
correlation is always low, averaging only about .166, but under proper experimental
conditions the coefficients have always been found to be positive. When experiments
are not made with all necessary precautions, substratum heterogeneity may
completely obscure the influence of seed weight, reducing the correlation to
practically zero or even bringing about a substantial negative correlation.


2. There is also a significant positive correlation between the weight of the
seed planted and the number of ovules and the number of seeds in the pods produced
by the plant developing from it. These correlations are so low that on
relatively small samples negative values may be found. They average only about
one-fifth to one-third the magnitude of the correlation for weight planted and pods
per plant.


The relationship for weight and ovules is numerically higher than that for 
weight and seed, but on the basis of the number of series now available the 
difference cannot be asserted to be significant. 


3. Morphogenetically and physiologically, the observed correlations between 
weight and ovules and weight and seeds are to be regarded as the resultant of two 
other correlations, namely, that between the weight of the seed planted and the 
number of pods per plant and that between the number of pods on the plant and 
the characteristics of these pods. This conclusion is based on the fact that the 
partial correlation coefficient for weight of seed planted and number of ovules or 
seeds per pod for constant number of pods per plant is practically zero. 
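
The reasoning in conclusion 3 can be illustrated with the usual first-order partial correlation formula. The sketch below is only an illustration: the value .166 is the average quoted in conclusion 1, while the other two coefficients are hypothetical and chosen so that pods per plant carry the whole relationship. The function and variable names are likewise assumptions, not anything used in the paper.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Weight of seed planted vs. pods per plant: average quoted in conclusion 1.
r_weight_pods = 0.166
# Pods per plant vs. ovules per pod: hypothetical value, for illustration only.
r_pods_ovules = 0.25
# Weight vs. ovules per pod: the value expected if the relationship arises
# entirely through the number of pods (product of the two correlations above).
r_weight_ovules = r_weight_pods * r_pods_ovules

partial = partial_correlation(r_weight_ovules, r_weight_pods, r_pods_ovules)
print(f"partial correlation, pods constant: {partial:.6f}")
```

When the weight and ovule correlation equals the product of the other two, the numerator vanishes and the partial coefficient is zero, which is the situation conclusion 3 describes.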


Cold Spring Harbor, N.Y.
August 20, 1913. 




ON THE PROBABILITY THAT TWO INDEPENDENT DISTRIBUTIONS
OF FREQUENCY ARE REALLY SAMPLES
OF THE SAME POPULATION, WITH SPECIAL REFERENCE
TO RECENT WORK ON THE IDENTITY OF
TRYPANOSOME STRAINS


By KARL PEARSON, 


(1) In Biometrika, Vol. VIII. p. 250, I discussed fully the mathematical
process requisite for measuring the probability that two independent distributions 
of frequency are really samples of the same population. As far as I am aware this 
is the only complete theory of the subject which has been published. I believe it 
to be scientifically adequate, and it has already been applied to a large number of 
problems*. 

Before that paper was published, it had been usual to compare any constants of
two frequency distributions together, and by a due consideration of their difference
relative to the combination of their probable errors to determine the probability of
the identity of those constants. This could be repeated for any number of corresponding
constants, and if theoretical curves of frequency had been fitted, their
divergence or correspondence could be measured by the divergence or correspondence of
their complete series of constants. The method above referred to, however, as
based on the general theory of sampling, calls for no hypothesis as to the general
theory of frequency. It takes the observed distributions and measures the probability
that both are samples from a large population. The population may be
homogeneous or heterogeneous; provided the samples are truly random samples
we obtain a measure of the probability of their common origin.


In the course of a long statistical experience I have learnt that it is wholly 
impossible to reach any safe conclusions as to the identity or non-identity of 
populations by any process of mere graphical comparison of frequency distributions. 


* In actual practice the χ² test of “goodness of fit” should always be made with not too fine grouping
at the terminals, especially when any group in the tails appears to be contributing largely to the
total of χ². This point was recognised ab initio (Phil. Mag. Vol. L. p. 164), and has recently been
re-emphasised by Edgeworth, Journal R. Statistical Society, Vol. LXXVII. p. 198.


86 A Study of Trypanosome Strains 


The distributions in appearance are wholly dependent on the choice of scales 
and the eye alone cannot possibly make any measure of the degree of accordance, 
which will have scientific value. 


In the accompanying Diagram I, for example, we have the frequency distributions
permille of two strains of trypanosomes, (aa) from a Donkey and (bb) from
a Hartebeeste. These we are told are identical. Below, (cc) and (dd) give the
frequency distributions of head-breadths for two races, Egyptian and English women,
separated by an interval of 7000 years. These strains we know to be different, but the
eye that judges (aa) and (bb) to be the “same” * might well suppose (cc) and (dd) to be
also the same. Actually when we come to the quantitative measure of divergence,
the probability that (aa) and (bb) are samples of the same thing is P < .000,000,1,
while the probability that (cc) and (dd) are the same is P = .001. In other words it
is 10,000 times as likely that Egyptians of 6000 B.C. and the English of 1680 A.D.
are the same strain as that the trypanosomes from the Hartebeeste and those
from the Mzimba Donkey are of the same strain. Both may indeed be of the “same
strain” if a sufficiently wide meaning be given to the term. But is such a racial
resemblance as we find between the Prehistoric Egyptian woman and the English
woman diluted 10,000 times what we understand in ordinary language by the “same
strain”? All the mathematician can understand by “sameness of strain” is the
identity which corresponds to random samples of the same population. If the identity
has been modified by a long evolutionary process, by markedly differential environment
or treatment, is it not better to have some measure of a scientific nature of
the extent of the difference or of the sameness? The eye can never provide any
judgment of value on such a point. Especially is this the case if the graphs
represent percentages, as the degree of divergence is of course a function of the
number employed to determine the percentages. A deviation of frequency by percentages
based upon samples of 200 might look to the eye absolutely like the
deviation of frequency due to samples of 2000, but the scientific measure of the
probability of sameness would be widely modified.
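
The dependence on sample size can be made concrete with a small sketch. It uses the two-sample measure of divergence given in section (3) of this paper, the sum over categories of N N′ (f/N − f′/N′)² / (f + f′). The two percentage distributions are hypothetical and invented for illustration, and the function names are assumptions of the sketch.

```python
def two_sample_chi_square(first, second):
    """Divergence of two frequency series: sum of N*N'*(f/N - f'/N')**2 / (f + f')."""
    n, n_dash = sum(first), sum(second)
    total = 0.0
    for f, f_dash in zip(first, second):
        if f + f_dash > 0:  # a category empty in both samples contributes nothing
            total += n * n_dash * (f / n - f_dash / n_dash) ** 2 / (f + f_dash)
    return total


def counts_from_percentages(percentages, sample_size):
    """Turn whole-number percentages back into counts for a given sample size."""
    return [p * sample_size // 100 for p in percentages]


# Hypothetical percentage distributions over five groups (not data from the paper).
PERCENT_A = [10, 25, 35, 20, 10]
PERCENT_B = [14, 27, 31, 18, 10]

chi_200 = two_sample_chi_square(counts_from_percentages(PERCENT_A, 200),
                                counts_from_percentages(PERCENT_B, 200))
chi_2000 = two_sample_chi_square(counts_from_percentages(PERCENT_A, 2000),
                                 counts_from_percentages(PERCENT_B, 2000))

print(f"samples of 200:  chi-square = {chi_200:.3f}")
print(f"samples of 2000: chi-square = {chi_2000:.3f}")
```

The percentage curves are identical in the two cases, yet the measure of divergence is ten times as large for the samples of 2000, so the probability of sameness derived from it is very different.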

That the reader should have evidence how excellent is the test, I have taken
the cranial lengths (Flower’s measurement) of 67 female skulls dug up in Liverpool
Street and compared them with the like lengths of 142 female skulls dug up in
Church Lane, Whitechapel. It is possible that both these sets of crania formed
part of the contents of plague pits, or there may be an interval in date of a century
between them†. Diagram I bis shows the data arranged as percentage frequency
curves. The χ² for 17 groups proceeding by 2 mm. ranges = 19.38, giving P = .250,
or once in four trials, if the material drawn from were the same, we should obtain
pairs of samples more divergent than the pair recorded. In other words we can
be confident that the Liverpool Street and Whitechapel crania represent persons

* An attempt to define the word “sameness” as used by writers on trypanosome strains would
doubtless serve a useful purpose, and emphasise the fact that we can only define “sameness” by appeal
to the theory of sampling, or by the adoption of some quantitative measure of the grade of likeness.
† See Biometrika, Vol. III. p. 191 and Vol. V. p. 86.


[Diagram I. Frequency curves. Upper panel, lengths in microns (15 to 35): aa, Mzimba Strain of Trypanosome; bb, Strain from Hartebeeste. Lower panel, breadths in millimetres (120 to 150): cc, Cranial Breadths of Egyptian Women 6000 B.C.; dd, Cranial Breadths of English Women 1680 A.D.]

[Diagram I bis. Percentage frequency curves of cranial lengths in millimetres: cc, Whitechapel ♀’s; dd, Moorfields ♀’s.]
of the same strain. That P = .25 and not, say, .85 may be merely a result of
random sampling, or it may arise from some difference of period or social class.
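
The passage from χ² to P for the crania can be checked with a short sketch. It evaluates the chi-square tail probability from its finite series (an exponential series for an even number of degrees of freedom, a complementary error function plus a series for an odd number) instead of reading it from printed tables. Taking the degrees of freedom as one fewer than the number of groups is an assumption of the sketch, chosen because it agrees with the P = .250 quoted for 17 groups; the function names are illustrative.

```python
import math


def chi_square_tail(chi_sq, df):
    """Probability that a chi-square variate with df degrees of freedom exceeds chi_sq."""
    if df < 1 or chi_sq < 0:
        raise ValueError("need df >= 1 and chi_sq >= 0")
    half = chi_sq / 2.0
    if df % 2 == 0:
        # exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(chi/sqrt 2) + sqrt(2/pi) * exp(-x/2) * (chi + chi^3/3 + chi^5/(3*5) + ...)
    chi = math.sqrt(chi_sq)
    term, total = chi, 0.0
    for j in range(1, df - 1, 2):  # exponents 1, 3, ..., df - 2
        total += term
        term *= chi_sq / (j + 2)
    return math.erfc(chi / math.sqrt(2.0)) + math.sqrt(2.0 / math.pi) * math.exp(-half) * total


def p_for_groups(chi_sq, groups):
    """P for a chi-square computed over `groups` categories (groups - 1 degrees of freedom)."""
    return chi_square_tail(chi_sq, groups - 1)


p_crania = p_for_groups(19.38, 17)
print(f"P for chi-square 19.38 over 17 groups: {p_crania:.3f}")
```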


(2) In a long series of papers recently published in the Proceedings of the 
Royal Society, Section B, conclusions are reached as to the identity of various 
strains of trypanosomes. These conclusions are largely based on a comparison of 
graphs of the frequency percentages obtained by measurement of hundreds of 
trypanosomes. 

To some extent mean values are given for the different strains, but no arguments
whatever can be based on them, for in no case has the probable error of the
difference been calculated. Even if it had been calculated, this constant alone 
would not have sufficed to determine the sameness or difference of the strains. 
Further, the percentages of various forms in the strains are sometimes given; 
but again no attempt has been made to determine whether the differences of 
these percentages are or are not significant. It seems sufficient here to consider 
the far more valid test of the sameness or diversity of the frequency-distributions 
as a whole. 
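
The missing step for the means can be sketched as follows. The probable error is taken as 0.6745 times the standard error, and the standard error of a difference of two independent means as the root of the sum of the two squared standard errors. All numerical values below are hypothetical and are not measurements from the Commission's reports. As the text notes, this constant alone would not settle sameness or difference of strains; the sketch only shows what was left uncalculated.

```python
import math

PROBABLE_ERROR_FACTOR = 0.6745  # probable error = 0.6745 x standard error


def probable_error_of_difference(sd_1, n_1, sd_2, n_2):
    """Probable error of the difference between two independent sample means."""
    standard_error = math.sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return PROBABLE_ERROR_FACTOR * standard_error


# Hypothetical mean lengths in microns, standard deviations and sample sizes.
mean_1, sd_1, n_1 = 22.0, 3.5, 500
mean_2, sd_2, n_2 = 23.1, 4.0, 500

pe = probable_error_of_difference(sd_1, n_1, sd_2, n_2)
ratio = abs(mean_1 - mean_2) / pe

print(f"difference of means = {abs(mean_1 - mean_2):.2f} microns")
print(f"probable error of the difference = {pe:.3f} microns")
print(f"difference / probable error = {ratio:.1f}")
```

A difference several times its probable error would ordinarily be treated as significant, while one of the same order as its probable error would not.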

I shall divide my investigation into four parts : 

(i) The probability of identity of the strains on the evidence presented in the 
reports of the Commission of the Royal Society, Nyasaland, 1912. 

(ii) The probability that the host or the animal in which the trypanosome is 
cultivated makes essential differences in the distributions of frequency. 




(iii) The probability that the strains are alike after allowance has been made 
for the host. 


(iv) The nature of the heterogeneity which is statistically demonstrable in 
the bulk of trypanosome measurements. 


I should like before considering the material to indicate one or two very
important points. I am not concerned here with the truth or error of the conclusions
drawn by Sir David Bruce and his collaborators. I am only concerned
with the nature of the process by which they have drawn their inferences. That
process consists in a measurement of the individual trypanosomes and an appeal
to the statistics of these measurements, in short to what I should term biometric
reasoning. There may well be other means of discussing the resemblances of the
different strains of trypanosome, either by microscopic examinations of divergencies
in the life history of the different strains or by differentiation in their
action on different hosts, or otherwise. But in the present case the appeal to statistics
of measurement has been made. Drs Stephens and Fantham in their paper on
T. rhodesiense (R. S. Proc. Vol. 85, B, p. 227) actually term their work a “biometric
study,” and the later papers of Sir David Bruce and others are no less “biometric.”
Now if an appeal be made to statistics, then by a statistical method alone can the
answer be given. Further, that method must be the analysis of the modern fully
equipped and highly trained statistician. Such a statistician, and he alone, can
assert or deny on the basis of statistics the probability of any of these strains
of trypanosomes being samples of the same population; he alone is in a position
to judge the value of the evidence provided by the frequency distributions. If he
finds substantial “divergence” where Sir David Bruce and his collaborators
assert “sameness,” then either statistical theory is wrong, or Sir David Bruce
understands by “sameness” something quite different from the “sameness” of the
statistician, and something which cannot be judged by the methods of statistics, to
which accordingly no appeal should have been made, or only an appeal after a long
series of control experiments. The “sameness” postulated by Sir David Bruce is
something quite incompatible with the “sameness” found by the statistician when
he investigates two samples of 100 crania of the same race or two samples of 1000
blood corpuscles of two series of frogs of the same race. It is what the statistician
calls marked divergence and not sameness. If it be asserted that the extreme
divergence actually existing between the strains of trypanosomes statistically discussed
is due to difference of individual host and not to difference of strain, it will
be clear that the divergence and not the sameness ought to have come out of the
statistical investigation, and then control investigations ought to have been made to
explain that divergence by environmental or other differences. But this is à priori
to assume the identity of the strains and à posteriori to seek an explanation of
marked divergence deduced statistically, whereas in the actual papers this great
divergence is assumed to be statistical sameness and this sameness used as an
argument for identity of strains. The statistician coming to the data critically



does not of course assert dogmatically that any two strains are not of identical
race. What he does assert is that no argument for the sameness of the strains can
be based on the statistics provided; for these actually show wide divergence, and
he asks, if the strains are à priori assumed to be “same,” for a full à posteriori
examination of the sources of the divergence.


The scope of the present paper is not the complete investigation of all the data
of the Royal Society Commission, nor an endeavour to obtain from the published
data the full conclusions which may be legitimately drawn from them. Its purpose
is to illustrate the statistical methods which ought to be applied to such material
and to indicate the essential necessity of control experiments on strains known to
be the same or accepted as different. A point should be noted here, namely, that
I have only found two cases where the strains on the basis of the statistical
evidence are said to be different. The first is in the case of Trypanosoma evansi
and Trypanosoma brucei. Sir David Bruce* gives (1911) the frequency distribution
of lengths of 820 individuals of T. evansi and compares it by means of a
graph of percentages with T. brucei. The percentages of the latter appear to be
deduced from the lengths for two series of 160 trypanosomes and 200 trypanosomes
cultivated in a variety of animals (Uganda, 1909, and Zululand, 1894) and published
in the preceding year†, but no reference is given in the paper to the
original of the percentages in the graph, nor is any demonstration given in the
paper of 1910 of the statistical sameness of the Uganda and Zululand strains;
there is merely said to be “marked resemblance‡” where the trained statistician
finds marked divergence§. Stephens and Fantham|| use the curve of 1911 to
assert that there is a “general resemblance between the curves representing
the measurements of these trypanosomes (T. gambiense, T. rhodesiense, T. brucei)”
and consider that this “general resemblance” shows that “the method is a
trustworthy one.” It is not clear what “the method” referred to really signifies.
The statistical comparison of means and maximum and minimum
lengths without statement of probable errors, and the mere graphical examination
of frequency curves are wholly inadequate to determine sameness or


* R. S. Proc. Vol. 84, B, p. 186, 1911.

† R. S. Proc. Vol. 83, B, pp. 5 and 11, 1910.
‡ R. S. Proc. Vol. 83, B, p. 12.

§ The two distributions are as follows:


3 
13 | 14 | 15\16\17\18 | 19 | 20 | 21| 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32\ 33 | Totals 
Uganda, 1909 1| 2); 4) 6/10/26) 14/14/12; 9|12] 6/12/10! 7 1;2;3 3 160
Zululand, 1894} 4 | 3} 11!11/20/32/17; 4) 4] 3) 5/3] 7] 7/10/13/13| 8/10) 8/3) 1 3) 200 
| 


These give χ² = 101.18, leading to P < .000,001, or not once in a million trials would two so divergent
distributions be obtained by sampling the same population.
|| R. S. Proc. Vol. 85, B, p. 232, 1912.




divergence. Only a month later than Stephens and Fantham’s paper appeared
another paper by Sir David Bruce and others* comparing human trypanosomes
from Nyasaland with T. brucei and T. rhodesiense and T. gambiense. But the
curve for T. brucei is wholly different from that of a year earlier. Instead of a
minimum at 24 microns there is now a maximum at 24 microns, and the “general
resemblance” of T. brucei to T. evansi is much increased. We are now told that
T. rhodesiense (Stephens and Fantham) is “a distinct species, nearly related to
T. brucei and T. gambiense,” and the conclusion drawn that “the human trypanosome
disease of North-east Rhodesia and Nyasaland is not the disease known as
Sleeping Sickness in Uganda and the West Coast of Africa†.” But the divergence
between the frequency distributions of T. brucei and the human trypanosome of
Nyasaland when accurately measured is of exactly the same order as that which
suffices to demonstrate the identity of the human Nyasaland trypanosome and
T. rhodesiense. Thus the two cases in which divergence is asserted, i.e. (i) T. brucei
and T. evansi, (ii) T. brucei and T. rhodesiense, seem to be differentiated largely on
the base of unanalysed statistical evidence of a nature precisely like that which in
other cases is interpreted to mean close “general resemblance” or “sameness.”
We do not feel that we are in the possession of independent evidence of differentiation
which would enable us to test how far statistical divergency corresponds to
recognised morphological differences of strain, a fundamental requisite if we are
to interpret as “sameness” a statistical divergence of an extremely high order.

In concluding these introductory remarks we must refer to the types of 
trypanosome in Nyasaland recognised by Sir David Bruce and his colleagues as 
distinct on other grounds than numerical measurements. They are: 


(i) T. brucei vel rhodesiense. This is said to be the cause of the human
trypanosome disease of Nyasaland. The modal length appears to be 24 to 25
microns‡. According to Bruce and colleagues T. gambiense appears to have a
mode of 20 microns, but there is evidence for a submode at 26.


(ii) T. pecorum. This is said to be the cause of trypanosome diseases of
domestic animals in both Uganda and Nyasaland. The modal length varies from
13 to 14§. There is no statistical evidence of bimodality.


(iii) T. simiae. This attacks monkey, goat and warthog. Oxen, dogs, white
rats, etc., are said to be immune. The length distribution appears to be very
homogeneous and with a single mode at 18 microns||.


* R. S. Proc. Vol. 85, B, p. 481, 1912.

† R. S. Proc. Vol. 85, B, p. 433, 1912. In 1913, however, we find that “there is some reason for the
belief that T. rhodesiense and T. brucei are one and the same species,” see Sir David Bruce and others,
R. S. Proc. Vol. 86, B, p. 407.

‡ R. S. Proc. Vol. 84, B, p. 331. Stephens and Fantham’s measurements on T. rhodesiense
suggest modes at 20 and 26. Ibid. Vol. 85, B, p. 231. The double mode (roughly 18 to 20 and
28 to 29) appears in the Zululand (1894) and Uganda (1909) strains of T. brucei. Ibid. Vol. 83,
B, p. 12.

§ R. S. Proc. Vol. 82, B, p. 468, and Vol. 87, B, p. 14.

|| R. S. Proc. Vol. 85, B, p. 477, and Vol. 87, B, p. 48.



(iv) T. caprae. This is found in waterbuck, ox, goat and sheep. The distribution
of length is apparently homogeneous and the mode at 25 microns*.

I leave out of account several forms of trypanosome referred to by Sir David
Bruce and colleagues, e.g. T. vivax, T. uniforme, T. ingens, etc., of which no large
series of measurements were at my disposal.

With the exception of T. simiae, which occurs in the warthog, the above
trypanosomes appear to be found generally in the wild game and all of them are
found in the Glossina morsitans. Sir David Bruce and his colleagues suppose the
differentiation into these classes to precede the consideration of individual strain,
but the exact modus differentiationis is not clear from the memoirs.


(3) Method of Investigation. The actual formula employed in the present
investigation is very simple and can be applied by anyone able to do ordinary
arithmetic. If N and N′ be the sizes of two samples and the corresponding
frequencies be

$$f_1,\ f_2,\ \ldots,\ f_p,\ \ldots,\ f_s \quad\text{and}\quad f'_1,\ f'_2,\ \ldots,\ f'_p,\ \ldots,\ f'_s,$$

where $f_p$, $f'_p$ are the frequencies falling in the $p$th category, then if

$$\chi^2 = \sum_{p=1}^{s} \frac{N N' \left( \dfrac{f_p}{N} - \dfrac{f'_p}{N'} \right)^2}{f_p + f'_p}$$

be calculated, the probability P that the observed or a greater divergence between
the two series would arise from sampling the same population is obtained by
determining P from χ² by my method of testing “goodness of fit.” This method
was first published in the Phil. Mag., Vol. 50, p. 157, 1900. The shortest method
of actually determining P is by aid of Palin Elderton’s tables for P with argument
χ² issued in Biometrika, Vol. I. p. 155, 1902. This is the process used in the
measurements of sameness and divergence provided below.
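
The whole process of this section can be sketched in a few lines. The sketch computes χ² from two frequency series by the formula above and then obtains P from the finite series for the chi-square tail, in place of a look-up in printed tables. The two frequency series are invented for illustration, the function names are assumptions, and so is the choice of one fewer degrees of freedom than the number of categories.

```python
import math


def two_sample_chi_square(first, second):
    """Return (chi_sq, groups) for two frequency series over the same categories."""
    n, n_dash = sum(first), sum(second)
    chi_sq, groups = 0.0, 0
    for f, f_dash in zip(first, second):
        if f + f_dash == 0:
            continue  # a category empty in both samples is not counted as a group
        chi_sq += n * n_dash * (f / n - f_dash / n_dash) ** 2 / (f + f_dash)
        groups += 1
    return chi_sq, groups


def chi_square_tail(chi_sq, df):
    """Probability that a chi-square variate with df degrees of freedom exceeds chi_sq."""
    half = chi_sq / 2.0
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    chi = math.sqrt(chi_sq)
    term, total = chi, 0.0
    for j in range(1, df - 1, 2):  # exponents 1, 3, ..., df - 2
        total += term
        term *= chi_sq / (j + 2)
    return math.erfc(chi / math.sqrt(2.0)) + math.sqrt(2.0 / math.pi) * math.exp(-half) * total


# Hypothetical frequencies in four categories (not data from the paper).
sample_a = [10, 20, 30, 40]   # N = 100
sample_b = [40, 60, 60, 40]   # N' = 200

chi_sq, groups = two_sample_chi_square(sample_a, sample_b)
p_value = chi_square_tail(chi_sq, groups - 1)  # assumption: df = groups - 1

print(f"chi-square = {chi_sq:.2f} over {groups} groups, P = {p_value:.5f}")
print(f"roughly once in {round(1 / p_value):,} trials")
```

The last line mirrors the paper's habit of restating P as "once in so many trials" of sampling from a common population.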


(4) On the Probability of the identity of the Strains discussed by Sir David 
Bruce and others. 

(a) I take first the question of the “sameness” of the Wild-game strains of
trypanosomes as isolated from five antelopes: reedbuck, waterbuck, oribi, and two
hartebeeste. Sir David Bruce and others discuss these strains in a paper† of
February, 1912, and conclude, apparently from the statistical data, that “the five
Wild-game strains resemble each other closely and all belong to the same species.”


Now these Wild-game strains have a distinct advantage for they are all
obtained from the trypanosomes ultimately taken from the rat as host; they were
passed from the infected antelope through healthy goat, monkey or dog, which

* R. S. Proc. Vol. 86, B, p. 278.


† R. S. Proc. Vol. 86, B, p. 407, 1913. In the Table p. 405 for 2500 trypanosomes under the
heading 31 microns read a frequency of 33 not 53.




became infected, to the rat. The frequencies of lengths of the trypanosomes in 
microns were as follows: 


From Rat 15 


16 | 17 | 18| 19 | 20 |21 | 22 | 28 24 | 26 | 27 | 28, 20 | 30| a1 | 33| 84 | 35 Totals| 
Hartebeeste (1) ...|—|!—|—| 53| 80|92|53| 46! 45 | 98|25|21|10|19_ 66 2 |—|11] 500 
Hartebeeste (2) ... 11|30| 47] 79|51|51| 44) 37 36/35 28/24/11| 3 1)—| 500 
Oribi.. —| 1 | 10]22/77) 109/90 57} 2819) 23/15) 21/14] 6| 5) 2 |-./1|—|—] 500 
Waterbuck —!1]| 2] 8) 26| 59) 74|58|58| 44 | 9 | 3| 2 |—|—] 500 
Reedbuck —| 5 |80]57| 90] 81 53) 27/17) 16/18 | 23) 18 | 25) 18 | 500 

: | | | 

14| 41} 91 79 |56 22/19 16 15/9 | 2 -|-|- 500 


I questioned first whether the strains found in the two Hartebeeste were the
same; they give


χ² = 108.69, and therefore P < .000,000,1.


In other words not once in 10,000,000 trials would two such divergent samples
arise if the Hartebeeste strains were samples of the same population. I now
compare the Waterbuck and the Oribi; these provide χ² = 109.25 and P < .000,000,1,
and again the extraordinary divergence, not the sameness, is the statistical
feature. The reader may rest assured that equally incompatible results arise
when we compare the other antelopes. Statistically we are compelled to assert
either that the trypanosome strains in these different antelopes were different
species, or that, not only the infected species of antelope, but the individual
antelope of the same species (as in the case of the two Hartebeeste) immensely
modifies the strain of trypanosome. In short not the “sameness” of the strains,
but their great statistical divergence is the fact which impresses itself on the
biometrician. No biometrician could possibly accept the view of Sir David Bruce
and his colleagues that*:


“It is evident from these tables and charts that the various strains of this
trypanosome, as they occur in wild game are remarkably alike. This is what 
might be expected. Here the trypanosome is at home; it is leading a natural 
life. It may be supposed to be saved from variation by constantly passing and 
repassing between the antelope and the tsetse fly.” 


Our authors, it will be noted, directly appeal for “likeness” of strains to the 
tables and charts. 


With these immense measures of statistical differentiation, we ask: what would
be the values of χ² and P, if examples of differentiated strains of trypanosomes could
be found? If differences of host or treatment can produce these wide divergences, 
how without a preliminary study of the same strain in different hosts and under 
different treatments can we be certain whether these large divergences mean the 
same strain differently treated, or different species of trypanosomes ? 


* R. S. Proc. Vol. 86, B, p. 406. 




(b) The next comparison I make is between the Mzimba (Donkey) Strain
taken through rats and the above wild-game strains. I have added the data for
the Mzimba Strain to the last table (p. 93): it is given by Sir David Bruce and
others in a paper on the Mzimba Strain*. I compare the Reedbuck and the
Mzimba (Donkey) strains first. We find:


χ² = 53.37, P = .000,05.


Thus only once in 20,000 trials would a divergence as great as this arise, if the
two strains were samples from the same population.


The results of comparing the Mzimba strain with Waterbuck and Hartebeeste (1)
give respectively
χ² = 114.23, P < .000,000,1,
and χ² = 171.00, P < .000,000,1.


These give for practical purposes impossibility of a common source, thus still
further demonstrating that the marked feature of the wild-game and Mzimba
strains is divergence, not sameness.


Sir David Bruce and his colleagues write†: “The trypanosome of the Mzimba
strain is the same species as that occurring in the wild-game inhabiting the
Proclaimed Area, Nyasaland.” In an earlier paper a diagram‡ is given of the
frequency distribution of 3600 trypanosomes of Human strain taken from the rat
alone. These are drawn from four native cases of sleeping sickness in Nyasaland
and from one European case from Portuguese East Africa. As the individual
cases for the rats alone are not given, they have had to be read off the percentage
diagram, but the frequencies must be very nearly correct. This Human
strain may be compared with the T. rhodesiense, the T. brucei, the Mzimba
(Donkey) strain and a strain obtained from a native woman suffering from
“Kaodzera,” the so-called sleeping sickness of Nyasaland. The frequencies of
these five strains are given in the following table. I first compare the trypanosomes
of Nyasaland given as (b) in the table with T. brucei and T. rhodesiense, for this
is the comparison made by the authors themselves§.


Taking the trypanosomes of Nyasaland (b) and the T. brucei as figured in
percentage curves by Bruce and others, we have


χ² = 72.17, P < .000,000,1,


or it is impossible to ascribe any degree of sameness to these two strains. We
now compare the Nyasaland strain (b) with T. rhodesiense, and find


χ² = 69.95, P = .000,01;


* R. S. Proc. Vol. 87, B, p. 31, 1913.

† R. S. Proc. Vol. 87, B, p. 34.

‡ R. S. Proc. Vol. 86, B, p. 301.

§ R. S. Proc. Vol. 85, B, pp. 431 and 433.




thus once in 100,000 trials two such divergent samples might be drawn. Although
there is less divergence than in the case of T. brucei and Nyasaland (b), it is idle
to speak of such a degree of divergence as sameness.


Length in Microns. 


| 
14|15) 16) 17) 18 | 19 | 20 | 21 | 22 | 23 | 24) 26 | 27 | 28 | 29 
| | | 
| Mzimba (Donkey) (a) —'—|—|—| 2 14| 41) 91] 79} 56| 53| 38] 39 22 | 19} 16) 15| 9) 
Human, Native Woman (| —|—/} 1] 4/19) 42] 63] 81] 75] 91] 65) 66] 93] 91 |107|110|}104] 87) 
Human, mixed (c) | —| 4) 46) 111] 159 | 219 | 288 312 | 365 | 359 | 314 | 314 | 231 | 218/198 
T. brucei... ve | | 8|14,17 40| 63) 55) 66| 63) 75) 93 80 50) 38 
| rhodesiense 3/10/19 29 35| 67| 54] 92| 51) 74] 56] 68) 59| 85} 61] 72] 50 
| 
* 
Length in Microns (continued).
| | | | t- | 
| 80 | 3 33 | 84): 36 | 37) 38 | 39 Totals Remarks | 
| Mzimba (Donkey) (2) .../ 2| 2] S. Proc. Vol. 87, B, p. 31. 
| Rats only. i 
Human, Native Woman (6) 27|23/13| 1) 1 |—|—|—| 1220] &. S. Proc. Vol. 85, B, p. 427. | 
| Various hosts. | 
Human, mixed (c) 125 | 90 | 59 | 30/13) 8 | 2 | 2 | —| 3600 | R. S. Proc. Vol. 86, B, p. 301. Read | ni 
from diagram. Rats. 
brucei 27| 26/18/11] 4| 4) —|—|2|--| 1000/2. 8. Proc. Vol. 84, B, p. 331. 
= Read from diagram A 
T. rhodesiense 52] 28/13 5] 1] 1) Proc. Vol. 85, B, p. 227. @ 
| Various hosts. 


To further establish our point let us compare the Human strain (c) for 3600
trypanosomes with the T. rhodesiense. Here χ² = 325.47 leading to P < .000,000,01.
In other words the great degree of divergence for the case of the Nyasaland native
woman is exceeded at least a thousand times, when we take the big example of
four natives and one European.


Sir David Bruce and his colleagues write of these strains : 


“(1) The trypanosome of the human trypanosome disease of Nyasaland is
T. rhodesiense (Stephens and Fantham).” In other words the P = .000,01 is
interpreted as sameness.


“(2) This is a distinct species, nearly related to T. brucei and T. gambiense,
but more closely resembling the former than the latter.” In other words they at
this date distinguished between T. brucei and T. rhodesiense*, and as a result of
this distinction proposed to call the human trypanosome disease of North-east
Rhodesia and Nyasaland by the name “Kaodzera” as not being identical with the
sleeping sickness of Uganda and the West Coast of Africa. If we, however,
compare T. brucei and T. rhodesiense we find χ² = 46.83 and P = .019. In other


* R. S. Proc. Vol. 85, B, p. 433, 1912.




96 A Study of Trypanosome Strains 


words once in about 50 trials we might expect to get two samples from the same
population as divergent or more divergent than the distributions found for
T. brucei and T. rhodesiense. We have in fact in the cases of these two
trypanosomes reached our first instance of comparative sameness, and the statistics should
have shown Sir David Bruce and his colleagues that T. brucei and T. rhodesiense
were relatively the same, and though both differed from the human trypanosome
of Nyasaland widely, the approach to T. rhodesiense was only slightly closer.


The accordance, speaking in a relative sense, of T. rhodesiense and T. brucei
was asserted by Stephens and Fantham in March, 1912*. In May, 1912, Bruce
and others, speaking of the T. rhodesiense, term it a distinct species; in February,
1913, they say, although without publishing further frequency distributions,
that “There is some reason for the belief that T. rhodesiense and T. brucei
(Plimmer and Bradford) are one and the same species,”† and in a further paper of
the same month, “Evidence is accumulating that T. rhodesiense and T. brucei
(Plimmer and Bradford) are identical‡.” In May, 1913 (R. S. Proc. Vol. 87, B,
p. 34), we are told that the Mzimba strain is identical with the wild-game strain
and that “it has already been concluded that this species is T. brucei vel
T. rhodesiense.” As far as the statistics of the subject go the only really weighty evidence
for the identity is that of 1912, on which, without statistical analysis, the
distinction between the two species was asserted.


(c) We will next consider the possible identification of T. gambiense with
T. rhodesiense and with T. brucei.


The second identification is suggested by Sir D. Bruce and others in the words§: 


“Whether these slight differences are fundamental or only accidental it is 
impossible at present to say, but enough has been written to show that Trypano- 
soma gambiense and Trypanosoma brucei approach each other very closely in 
shape and size.” 


The following table|| provides the data for T. gambiense to be compared with
the distribution of T. rhodesiense ranging from 12 to 39 in the last table.


Microns. 


16 | 17 | 18 | 19 | 20 | 21 | 22


15 


49 


The trypanosomes are from a variety of hosts.
For the 28 classes we have χ² = 140.27 and P < .000,000,1. The chief point
therefore is the complete divergence, not the resemblance of the two series.


* R. S. Proc. Vol. 85, p. 238, 1912.
† R. S. Proc. Vol. 86, B, p. 407.
‡ R. S. Proc. Vol. 86, B, p. 302.


9 


2156) 79 114| 122 


§ R. S. Proc. Vol. 84, B, p. 332. 
|| R. S. Proc. Vol. 84, B, p. 330. 


29| 30) 31 32 33 | 34 | 35 | 36 | 39 | Totals 


age 20/11) 4 -|-|-|-| — 1000 




KARL PEARSON 97 


Stephens and Fantham, who term their work a “biometric study,” speak of
“the general resemblance between the curves representing the measurements of
these three trypanosomes (T. gambiense, T. rhodesiense, T. brucei).” They
continue: “We do not consider, however, that identity of measurement would
necessarily imply identity of species. We still believe that the difference in
internal morphology, namely the presence of the posterior nucleus, is sufficient to
separate T. rhodesiense both from T. gambiense and T. brucei*.” As a matter of
fact the “biometric study” of the data does not indicate identity in the
measurements, but confirms the result of internal morphology by proclaiming wide
differentiation†.


(d) We can now compare T. brucei and T. gambiense. Of these Sir David
Bruce writes: “Whether these slight differences are fundamental or only
accidental it is impossible at present to say, but enough has been written to show
that Trypanosoma gambiense and Trypanosoma brucei approach each other very
closely in size and shape‡.” The biometric commentary on this is that for length
of the two series χ² = 126.52, giving P < .000,000,1, and that as far as size is
concerned the samples differ immeasurably, i.e. far beyond the limits of the
calculated tables of P.


We should thus conclude, merely from the statistical evidence, for close
sameness in T. brucei and T. rhodesiense but for marked divergence of both from
T. gambiense.


* R. S. Proc. Vol. 85, B, p. 233.

† In a later section of this memoir I show that Stephens and Fantham have been markedly biased
in their judgment of even and odd units of measurement (p. 129 below), and that the recognition of
this makes a wide difference in the goodness of fit of my resolution into components to their data for
T. rhodesiense. It seems desirable therefore to inquire whether this bias affects the test of “sameness”
of T. rhodesiense with T. gambiense, T. brucei, and the Human strains (b) and (c), see the Tables
pp. 95-6. The data were accordingly classified into groups of two microns, starting with 12 and 13,
14 and 15, etc., so as to get rid of the even bias as far as possible, and we find:


Strains compared | Old Unit Ranges: Classes | χ² | P | New Two Unit Ranges: Classes | χ² | P
T. rhodesiense and gambiense | 28 | 140.27 | < .000,000,1 | 14 | 118.73 | < .000,000,1
T. rhodesiense and brucei | 28 | 46.83 | .019 | 14 | 25.76 |
T. rhodesiense and Human strain (b) | 28 | 69.95 | .000,01 | 14 | 45.92 | .000,06
T. rhodesiense and Human strain (c) | 28 | 325.47 | < .000,000,01 | 14 | 253.37 | < .000,000,01


The bias towards even numbers of Stephens and Fantham has thus not substantially influenced our
results, which still show the relative likeness of T. rhodesiense and T. brucei, and the marked divergence
of the former from T. gambiense and the human strains.
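The regrouping described in this note can be sketched in Python as follows. Only the rule of pairing 12 with 13, 14 with 15, and so on comes from the text; the function name `pair_classes` and the sample counts are illustrative assumptions and are not data from the memoir.

```python
def pair_classes(frequencies):
    """Merge adjacent unit classes two at a time (12 with 13, 14 with 15, ...).

    `frequencies` lists the counts for consecutive one-micron classes,
    beginning with an even-numbered class boundary such as 12.
    """
    if len(frequencies) % 2:
        raise ValueError("expected an even number of unit classes")
    return [frequencies[i] + frequencies[i + 1]
            for i in range(0, len(frequencies), 2)]


# Hypothetical counts with an exaggerated preference for one parity of reading.
unit_counts = [3, 9, 5, 14, 8, 20, 7, 12]
paired_counts = pair_classes(unit_counts)
print(paired_counts)

# Twenty-eight unit classes (12 to 39) collapse to fourteen two-micron classes.
print(len(pair_classes([1] * 28)))
```

Because each merged class contains one even and one odd reading, a preference of the observer for one parity shifts counts only within a class and so no longer affects the comparison.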

‡ R. S. Proc. Vol. 84, B, p. 332.




98 A Study of Trypanosome Strains 


(e) It seemed well worth while to investigate how far the two Nyasaland 
strains of Human Trypanosomes given in the table on p. 95 agree or differ. The 
first (b) of these strains from a native woman of Nyasaland may be compared with 
(c) a compound strain from four natives and a European. We find 


χ² = 172.36,
giving P < .000,000,1.


In other words, the two Nyasaland strains from human beings are indefinitely
differentiated. I now compare the Mzimba (Donkey) strain* (a) with human
strains (b) and (c), and find:


for (a) and (b) χ² = 223.16, giving P certainly < .000,000,01;
for (a) and (c) χ² = 348.55, giving P < .000,000,01.


Thus the trypanosome strain found in the donkey appears to be absolutely 
incomparable with that found in man in Nyasaland, just as the strain found 
in the donkey differed from that found in wild-game. 


(f) We may now turn to a memoir† by Sir David Bruce and others
comparing the Mvera cattle strain, the wild-game strain, and the wild Glossina
morsitans strain. They give on p. 13 of that paper the graphs for 500 specimens
of T. pecorum, the wild-game strain, and of the wild Glossina morsitans strain taken
from a variety of hosts. The following are the frequencies:


Microns.

Strain | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | Totals
Mvera Cattle Strain | 15 | 64 | 101 | 1386 | 59 | 8 | 1 | - | 500
Wild-Game Strain | - | - | 2 | 34 | 85 | 172 | 119 | 63 | 22 | 3 | - | 500
Wild G. morsitans Strain | 1 | 4 | 16 | 42 | 129 | 147 | 103 | 42 | 15 | 1 | - | 500


We compare first Mvera cattle strain with the wild-game strain and find for
our 10 categories
χ² = 34.554, P = .000,243.


This is a relatively low degree of divergence considering that P has been running
into 1 in 10,000,000! But it means that if these two strains were samples of one
and the same population, we should only expect two such divergent samples to
occur once in 4000 trials.


* This Mzimba strain of trypanosome is discussed in a paper headed: ‘Morphology of the various
strains of Trypanosome causing Disease in Man in Nyasaland. The Mzimba Strain’ (R. S. Proc.
Vol. 87, B, p. 26); it is said to be of the Nagana type and is identified by Sir David Bruce and
colleagues with T. brucei vel rhodesiense, the source of the human trypanosome disease.

† R. S. Proc. Vol. 87, B, p. 4.


> 


KARL PEARSON 99


Next we find for Mvera cattle strain and the wild Glossina morsitans strain,
χ² = 40.508, or P = .000,008,


or only once in 125,000 trials would a pair of samples so divergent arise when 
testing the same material. 


Lastly, testing the resemblances of wild-game strain and wild G. morsitans 
strain, we find 
χ² = 35.41, or P = .000,2,


not such a gigantic divergency as we have found in many cases, but a difference
so great that it only occurs once in 5000 trials requires explanation as such
and cannot be used as an argument for “sameness.”
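The statistic used throughout this section can be reproduced with a short Python sketch. It assumes that the test is the usual χ² for two frequency series on common classes, each class contributing N₁N₂(f₁/N₁ − f₂/N₂)²/(f₁ + f₂); the function name is illustrative. The class alignment used below (the wild-game series beginning at 11 microns with a frequency of 2, the wild G. morsitans series at 9 microns) is likewise an assumption about the printed table, supported only by the fact that it returns the value quoted above.

```python
def chi_square_two_samples(first, second):
    """Chi-square for two frequency series recorded on the same classes.

    Each class contributes N1 * N2 * (f1/N1 - f2/N2)**2 / (f1 + f2),
    where N1 and N2 are the two totals. Classes empty in both series
    contribute nothing.
    """
    if len(first) != len(second):
        raise ValueError("both series must use the same classes")
    n1, n2 = sum(first), sum(second)
    total = 0.0
    for f1, f2 in zip(first, second):
        if f1 + f2 == 0:
            continue
        total += n1 * n2 * (f1 / n1 - f2 / n2) ** 2 / (f1 + f2)
    return total


# Lengths 9 to 18 microns; alignment of the rows is an assumption (see above).
wild_game = [0, 0, 2, 34, 85, 172, 119, 63, 22, 3]
wild_fly = [1, 4, 16, 42, 129, 147, 103, 42, 15, 1]

chi2 = chi_square_two_samples(wild_game, wild_fly)
print(round(chi2, 2))
```

Working on the actual frequencies rather than on percentage curves is the point of the formula: the totals N₁ and N₂ enter explicitly, so series based on widely different totals are weighted correctly.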


It will thus be quite clear that as far as the measurements of length go, there 
is wide divergence to be accounted for between the trypanosomes found in the 
cattle, the wild-game and the tsetse fly, and that statistically this divergence is the 
remarkable feature. Yet the conclusion of Sir David Bruce and his colleagues, 
arguing very largely from the frequency distributions, is that “The Mvera cattle 
strain, the wild-game strain and the wild G. morsitans strain belong to the same 
species of trypanosome, T. pecorum*.”


It will be seen that actual statistical analysis does not in any way confirm the 
bulk of the conclusions reached by Sir David Bruce and his collaborators. The 
strains may or may not be ultimately of like origin, but what is quite clear from 
the analysis is that, if we are to rely on the measurements, then it is the diver- 
gence, not the sameness of these strains, which should have been emphasised. 
No stronger evidence could be deduced of the danger of appeal to statistics 
when the statistics are not handled by the trained statistician. The mere appeal 
to the resemblance of frequency curves given in the form of percentages, often 
based on widely different totals, is an only too common error of medical investi- 
gations ; it is by no means confined to the Scientific Commission of the Royal 
Society, Nyasaland. But it has recently become so marked a feature of Series B 
of the Proceedings of the Royal Society, that a vigorous protest is really needful. 
Thus in the very last part issued (Vol. 87, B, p. 89) occurs a paper on “ The 
Trypanosomes causing Dourine.” In this paper there may be microscopic evidence 
to differentiate the strains A, B and C dealt with; on that I cannot express an

* A further conclusion is also reached (Ibid. p. 26): “T. pecorum, Nyasaland, is identical with the
species found and described in Uganda.” Unfortunately the species found in Uganda is dealt with
in a paper (R. S. Proc. Vol. 82, B, p. 468) which provides no frequency distributions, and does not tell
us the total number on which the mean length, 13.3 microns, is based. The mean value of the
T. pecorum, Nyasaland, is 13.954 (R. S. Proc. Vol. 87, B, p. 3) and the standard deviation is 1.393 in
microns; thus the probable error of the mean is .67449 × .0623. Assuming the Uganda trypanosome to
be the same strain and to have the same variability as the T. pecorum, Nyasaland, the difference of the
means = .654, with a probable error of .67449 × √2 × .0623 = .67449 × .088; thus the deviation of the means
is 7.73 times its standard deviation. A deviation so great would only occur about once in 4 x 10’ trials,
i.e., would be practically impossible if the two strains were identical. Here again it is excessive
divergence not sameness which the statistics indicate.




100 A Study of Trypanosome Strains 


opinion. But on pp. 92—3 percentage frequency curves are drawn for the three 
strains, and the following remark is made : 


“A survey of the curves obtained by plotting out in percentages the various 
lengths of trypanosomes encountered in each of the three strains is of interest. 
It will be observed that in the case of rats the curves of each of the strains corre- 
spond fairly closely.” 


Now what do the authors mean by “fairly closely”? In their conclusions 
they identify B and C and differentiate A. Unfortunately they have not given 
their actual frequencies, and I have had to endeavour to reconstruct them from 
the percentage curves. There results for the rat-data : 


Microns. 

16|17| 18 | 19 20) 21 | 22| 23; 24| 25 | 26 | 27 28; 29 31| 82) 33 | 34) 35| 86 Totals 

| | | | | j | 
Berlin Strain A 1/1/10} 9 12 17|17| $8 | 98 48 |47 (57 55/42 |39 8 500 
Frankfurt Strain B ...} —|—| 1/3) 5) 1} 4/10 20 22* | 18* 25 24 | 35* 23!15/18/15) 8 | 3 250 
East Prussian Strain C | —- 4|3 27 | 37 | 31 | 16 10| 7| 2|— —j|—{ 260 
| | | 


We obtain the following results : 
Strains A and B: χ² = 31.11, P = .0627,
Strains A and C: P = .0034,
Strains B and C: χ² = 72.72, P < .000,001.


Thus to judge from rats only, B and C are far more divergent from each 
other than either is from A; in other words the strain A is intermediate between 
B and C and closer to B, from which it is not immensely divergent; two such 
samples as A and B might, as far as the length distributions go, be drawn from 
common material once in 16 trials. 
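The phrase "once in N trials" used here and elsewhere is simply the reciprocal of P. A minimal Python sketch of the conversion follows; the function name is an illustrative assumption, and the three values of P are those quoted in the text above.

```python
def trials_per_occurrence(p):
    """Average number of trials in which a divergence this great occurs once."""
    if not 0.0 < p <= 1.0:
        raise ValueError("P must lie between 0 and 1")
    return 1.0 / p


for p in (0.0627, 0.019, 0.000008):
    print(p, round(trials_per_occurrence(p)))
```

P = .0627 gives about 16 trials, P = .019 gives rather more than 50, and P = .000,008 gives 125,000, in agreement with the round figures stated in the text.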


Now of course no one suggests that a conclusion drawn from this rat-material 
is to replace one drawn from guinea-pig material, but the statistician cannot agree 
that for rats “the strains correspond very closely ” ; and he finds it illogical to place 
the evidence of the rat-data on one side and proceed to draw conclusions from the 
ocular inspection of the guinea-pig curves, without noticing that the conclusion is 
markedly opposed to the proper deduction from rat-data. Indeed while the
guinea-pig-data† give a relatively high degree of relationship between B and C (P = .0157)
it is not as high as the rats give between A and B (P = .0627); and while the


* The values given by the percentage graphs in these cases are respectively 21, 17 and 34, and 
the total appears to be 247 and not 250 as stated. Either 247 were used or the graph is in error. 
The three individuals were introduced in a way calculated not to increase divergence. 

† The frequency distributions for the guinea-pigs have had to be reconstructed from the percentage
curves, the necessary data not being published by the authors. 


PEARSON 101 


relationships of A and B (P < .000,000,1) and A and C (P < .000,001) are very
low, the origin of the second hump in the guinea-pig distribution for A requires 
much more analysis and the certainty by control experiments, that it always 
repeats itself, and is not the result of hitting a “ pocket.” 


It seems to me that any statistical analysis by modern methods of the trypano- 
some data compels us to confess that either statistical methods must be discarded 
entirely in these trypanosome investigations, or they must be pushed to their 
logical conclusion, and used as the fundamental instrument of research which can 
guide our enquiries by inference and suggestion when, and when only, it is handled 
by the trained craftsman. Thus far the use made of statistical methods seems 
merely to have confused the issues, and brave would be the man who would venture 
to say after reading this section of our present paper that any two strains discussed 
by the commission are definitely “same” or certainly differentiated. 


(5) On the Probability that the Animal in which the Trypanosome is cultivated 
makes essential Differences in the Distributions of Frequency. 


But the very method which casts apparent discredit on the results at present 
reached seems able to lead us to definite conclusions provided we start with it as 
the fundamental mode of investigation. Really very little inspection seems to indi- 
cate that not only the host but the period of infection materially influences the 
frequency distribution. These points have not been wholly disregarded by the in- 
vestigators in this field, but they have had no quantitative measure by which they 
could appreciate the relative influence of the various environmental factors. Nor 
indeed could the method be fully applied without experimental observations on 
trypanosomes of the same strain subjected to differential treatment. Knowing in 
such cases the quantitative divergence produced, we should be in a position to infer 
whether two strains from different sources were separate species or merely modified 
by differential environment. Until we have such quantitative measure no hypothesis 
of sameness or difference can flow from statistical treatment; nobody as yet knows 
how much to attribute to environment, how much to attribute to individuality 
of strain. 


In endeavouring to throw light on this matter we are, however, checked at 
the very start by the absence of effective material. In some cases the period of 
infectivity is not given; in others we are not always able to break up the total 
frequency by reference to the host, or to a single host. And even when we merely 
classify by one type of animal as host, we may have reduced our material to such 
small numbers that samples may be “same,” which on larger numbers would
show the marked divergence due to the emphasis of smaller differences*. Some 
suggestive points can, however, be effectively dealt with and they are treated in 
the following paragraphs. 


* It may not be possible to differentiate Bavarian from Württemberger on samples of 50 crania,
although quite possible on samples of 400. 
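The effect of sample size mentioned in this note can be illustrated numerically. In the Python sketch below the two series of 50 are hypothetical counts, not measurements from any of the papers cited, and the statistic is assumed to be the two-sample χ² with class contribution N₁N₂(f₁/N₁ − f₂/N₂)²/(f₁ + f₂). Multiplying every frequency by 8, so that samples of 50 become samples of 400 with unchanged proportions, multiplies χ² by 8.

```python
def chi_square_two_samples(first, second):
    """Two-sample chi-square on common classes (empty classes are skipped)."""
    n1, n2 = sum(first), sum(second)
    total = 0.0
    for f1, f2 in zip(first, second):
        if f1 + f2:
            total += n1 * n2 * (f1 / n1 - f2 / n2) ** 2 / (f1 + f2)
    return total


# Hypothetical samples of 50 individuals each, in five classes.
small_a = [5, 12, 18, 10, 5]
small_b = [3, 9, 17, 13, 8]

# The same proportions observed on samples of 400.
large_a = [8 * f for f in small_a]
large_b = [8 * f for f in small_b]

chi_small = chi_square_two_samples(small_a, small_b)
chi_large = chi_square_two_samples(large_a, large_b)
print(round(chi_small, 3), round(chi_large, 3))
```

With five classes the smaller value would pass as "same" while the larger would not, although the proportions are identical; this is the sense in which small numbers can hide a real divergence.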




102 A Study of Trypanosome Strains 

(a) I ask what difference is made when a strain is passed through various
animals (goat, monkey, dog, rat) or through a single animal alone. Taking the
wild-game strain discussed by Sir David Bruce and others*, we have:


Microns.

 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | Totals
Wild-Game Strain (from various animals) | 2 | 34 | 85 | 172 | 119 | 63 | 22 | 3 | 500
Wild-Game Strain (from a single rat 510) so) | 137 | 168 | 327 |


Here we find χ² = 65.37 and P < .000,000,1. In other words the distribution of
lengths of the trypanosomes of the wild-game strain obtained from various animals
differs so enormously from that obtained from a single rat that the two cannot be
looked upon as samples of the same population. The moment this result is realised
we appreciate that (i) it is impossible to compare two strains developed in a variety
of animals unless we have previously tested on the same strain the equal valency
of these animals, (ii) a series of animals of even the same species may quite
possibly give widely divergent results from those obtained for a single animal.
Thus passing from a variety of animals in wild-game strain to a variety in wild
G. morsitans strain makes less difference (P = .000,008), although great enough,
than passing from a variety of hosts to a single rat in the wild-game strain.
This rule is not universal, but it illustrates the absolutely essential need for
testing the effect of change of host before questioning the identity or non-identity
of two strains.


(b) I now turn to the Mvera cattle strain, and ask what differentiation is 
produced by the dog and goat as hosts. The data are very sparse and unless we 
get a high degree of resemblance may be worth little. They run†:


* R. S. Proc. Vol. 87, B, pp. 6 and 8. 


† R. S. Proc. Vol. 87, B, p. 3. I tested the relative interchangeability of goat and sheep in the case
of T. caprae. The data are as follows (R. S. Proc. Vol. 86, B, p. 280):


Microns. 
| TT. caprae 25 | 26 | 27 | 28 29 | 30 | 31 32 | Totals | 

85/43 | 50 | 33 | 28/27/17) 5 | 1 | —| 260 
Sheep 10 12 | 29 | 39 | 31 | 28 | 20 5) 8 1) 1) 180 | 


leading to χ² = 18.088 and P = .1133, or the resemblance is considerable although not so great as we find
between goat and dog for the Mvera cattle strain.




PEARSON 103 


Microns.
9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | Totals


We have χ² = 5.396 leading to P = .714, or in 71 pairs of samples out of 100
from a homogeneous population, we should get more divergent results. It follows
therefore that, as far as these small series of this strain go, goat and dog are
interchangeable as hosts.


Let us go a stage further and ask whether ox is interchangeable with goat and
dog. The following is the frequency distribution for the trypanosomes through
the ox:


Microns. 
| | | | ie | | 
9 } 32 14 | 15 | 16 18 Totals | 
Mvera Cattle,Ox ... | — a a 7 8 | 33 | 44 49 21 | ¥ | l 180 | 
| 


Comparing these data with the goat strain, this gives

χ² = 9.559 and P = .3888,
and compared with the dog strain

χ² = 9.461 and P = .3973.


Thus in about two out of five trials from the same population we should get
pairs of samples differing more than the dog and goat strains do from the ox 
strain. We conclude that while for practical purposes dog, goat and ox strains in 
the Mvera cattle trypanosomes are interchangeable, yet the dog and goat strain 
are nearly twice as much alike as the ox strain is to either. Lastly—although it 
is rather a rash proceeding—I compare rat with goat and dog. It is rash because 
only 40 trypanosomes through the rat were measured, and this is wholly inadequate 
for real determination. The frequencies for the lengths are: 


 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | Totals
Mvera Cattle, Rat | -
Mvera Cattle, Dog and Goat | 25 | 49 | 56 | 40 | 21 | 1 | 200


We find χ² = 21.329 and P = .0064. The small series of rat trypanosomes
probably accounts for no smaller value of P, but the odds of 155 to 1 are 
sufficient to show that rat series must not be mixed with series from the goat, 


104 A Study of Trypanosome Strains 


dog or ox. This confirms the view obtained for the wild-game strain, that a 
strain taken through the rat as host is incomparable with strains from other 
animals. 


(c) The totals considered for one species of host in (a) and (b) are rather
small. Larger numbers are forthcoming for the so-called Mzimba strain of
trypanosomes taken from a donkey at Mzimba. The frequencies are here*:


Microns. 


| | | | = | 
| 27 18 | 19 20 21| 22 | 23| 24! 25 26 | 27 | 28 29 81 | 32 | Totals, 
Mzimba Strain, Dog | 3 8 17/56 69 67 47 27/22) 12; 7 4) 4 | 
» Rat|2 14 41 91 79 56 od Sa 16/15! 9 | 
| 


| 


We find χ² = 25.499 and P = .0619. Thus only about once in 16 trials should
we get such a degree of divergence as the two samples present, drawing them from
the same population. This is very far from such a divergence as we have noted
in the rat and dog for the Mvera cattle strain, or in the case of rat against other
animals in the wild-game strain, which was extremely large. The only
explanations that occur to me here are:


(i) In the case of the wild-game strain and the Mvera cattle strain a single 
rat seems to have provided all the trypanosomes, while in the case of the Mzimba 
strain two rats were used ; this might lessen the influence of individuality. 


(ii) In the case of the Mvera cattle strain and the wild-game strain the
trypanosomes were ultimately taken from a great number of individuals. In the
Mvera cattle case we are told that 32% of the herd were affected, and we have
some details of 16 head of cattle and 5 donkeys naturally infected†. In the
wild-game case, the wild game affected were very numerous, covering cases of eland,
reedbuck, waterbuck, bushbuck, oribi, koodoo, hartebeeste, buffalo and hyaena.
Now can we start with the hypothesis that all the individual cattle and all the
individual wild game were each bitten by a fly carrying the same strain of
trypanosome? Have we any more right to suppose à priori that one wild-game
strain of trypanosome and one cattle strain of trypanosome exist, and ask
whether these two are identical, than to ask whether the strains carried by hyaena
and hartebeeste are the same? We have already (p. 93) seen that the strains
from two hartebeeste are extremely divergent. What right have we à priori
to classify all wild-game trypanosomes together and call them a wild-game strain?
And if two antelopes, whether of the same or of different species, give widely
different results, why are the trypanosomes of oxen of the same herd or donkeys
and oxen from the same neighbourhood to be classed à priori as of one species?


* R. S. Proc. Vol. 87, B, p. 31.
† R. S. Proc. Vol. 87, B, p. 15.




KARL PEARSON 105


If we turn to the Mvera cattle, we find there were four sources of trypanosomes
for the ox, two for the goat, and the same two for the dog, these two sources being
two of the four cattle sources. There was only one source for the rat, but I have
not discovered how far it was identical with one of those for ox or goat*. In the
Mzimba donkey strain there was one source for dog and rat. In the wild-game
strain there were, I make out, eight sources of trypanosomes for the goat, four for
the dog, and only one for the rat†.


Thus the individuality, which might be supposed to influence the result, 
because we are treating of trypanosomes in this case from a single rat, in the 
Mvera cattle case from a single rat, and in the Mzimba data from only two rats, 
may really arise from the fact that the rat strains in each case are derived from a 
single source, while the dog, goat and ox strains show a multiplicity of sources. 
The troublesome point is that the experimental part of the work has not been 
designed to answer what seem to me fundamental questions. We cannot directly 
inquire what difference the host makes because different hosts have rarely been
treated with the strain from a unique source. We can say that dog and goat are 
interchangeable for the Mvera cattle strain, because both drew trypanosomes from 
the same two sources ; but we cannot determine whether the difference in the ox 
is due to difference of the host, or to the introduction of two more sources. Simi- 
larly the divergence between the trypanosomes from rat and from other animals for 
the wild-game strain may be due to using one rat and therefore one source, and not 
the many sources of the other animals, or it may really be due to the differentiation 
of the host. In the same way the difference between the two hartebeeste may be 
due to individuality in the same species, or to infection from different strains. 


(d) To some slight extent we may appreciate the effect of individuality by
comparing the two rats 512 and 513 in the case of the single source, the Mzimba
strain‡.

The frequencies are as follows : 
Microns. 

| 


pest 
21 | 22 | 23 | 24 | 25 | 26 27 | 28 | 29| 30 31 Totals 


Mzimba Strain 16\17 18 19 | 20 


| | | | | 
Rat 512 | —| | 17] 37) 36 | 26 | 25 18 | 22 


1 2] 240 
Rat 513 9 | 24/54) 43 8/10; 5|6/2)1 260 
The numbers are not as large as we should like; but they give
χ² = 17.39, P = .3306.

* R. S. Proc. Vol. 87, B, pp. 2 and 15.

† R. S. Proc. Vol. 87, B, pp. 6 and 8 compared with 5. Rat from p. 8.

‡ R. S. Proc. Vol. 87, B, pp. 29 and 31.


106 A Study of Trypanosome Strains 


Clearly then two samples as divergent as those found would occur on the 
average once in three trials. It follows that two individual rats are really inter- 
changeable and we note that the extent to which ox is interchangeable with dog 
or goat for the cattle strain is very much the degree in which two rats are inter- 
changeable. To judge from this single instance, individuality within the same 
species of host is not very important, and when we find two hartebeeste differing
as those considered on p. 93, it seems much more likely, with the information we 
have at present got, that the hartebeeste were infected with different strains of 
trypanosome than that their individuality produced the enormous divergence 
noted. Again the sensible divergence between Mzimba strain in dog and rat on 
p. 104 is probably due to difference of host, but the enormous difference in the 
wild-game strain between a single rat and dog and goat on p. 103 is probably due
to differences in the strains of trypanosomes in the various types of wild game 
dealt with. We may consider whether the dog and goat data for the wild-game 
strain differ sensibly. We have* 

Microns.

 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | Totals
Wild-Game Strain, Goat | 1 | 16 | 37 | 73 | 38 | 26 | 8 | 1 | 200
Wild-Game Strain, Dog | - | 12 | 31 | 57 | 24 | 6 | - | 180


Here χ² = 6.04 and P = .5378. Thus in more than half the trials we should
obtain from homogeneous material pairs of samples more divergent than those for
dog and goat. This confirms the view formerly expressed that as far as
trypanosomes are concerned dog and goat are interchangeable. We cannot yet say that
they are not interchangeable with the rat, as the mixture of strains in dog and
goat and the uniqueness of strain in the rat may account for the marked
divergence of the latter. Sir David Bruce and his colleagues do not appear to
have noticed the wide divergence of the distribution of the rat from the dog and
goat either as indicating the heterogeneity of the wild-game and the cattle strains
of trypanosomes, or as suggesting such wide differentiation of strain by the host, that
rat-material cannot be mixed with that from dog and goat. They do, however,
remark of the wild-game strain: “In this the rat is not a suitable animal, since
many strains of T. pecorum have no effect on it†.” This suggests that T. pecorum
is not homogeneous and that the rat exercises a selective influence on its strains.
The suggested rejection of the rat data seems, however, to be based upon the
inconvenience of its non-infectivity, and not on what might turn out to be of great
importance, a selective influence on wild-game or cattle strains. It is not possible
to test this selective power in the present instance, as we do not actually know
how heterogeneous either the cattle or wild-game material used really was.


* R. S. Proc. Vol. 87, B, p. 7.
† R. S. Proc. Vol. 87, B, p. 7.




KARL PEARSON 107


(e) If we turn to the T. pecorum strain as actually found in the tsetse fly, we
see that Sir David Bruce and his colleagues deal with these trypanosomes passed 
through a variety of animals, of which only goat and dog supply sufficient numbers 
for any even approximately accurate treatment. The data are as follows*: 


Microns.

 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | Totals
Wild G. morsitans strain: Goat | 1 | 3 | 12 | 21 | 55 | 60 | 32 | 12 | 4 | - | 200
Wild G. morsitans strain: Dog | - | - | 3 | 14 | 34 | 41 | 40 | 19 | 9 | - | 160
Wild G. morsitans strain: Rat | - | 22 | 28 | 19 | 80


For goat and dog we find χ² = 19.518, which gives P = .0125. The resemblance
is therefore far less than we have found for goat and dog in other strains; only
once in 80 trials from homogeneous material would two samples of such divergent
character arise. Before we comment on this it seems desirable to compare the
very inadequate rat data.
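The passage from χ² to P for the goat and dog series can be checked with the following Python sketch, which evaluates the upper tail of the χ² distribution in closed form for a whole number of degrees of freedom. The function name is illustrative. Taking the degrees of freedom as 8, that is the nine length classes occupied in the goat and dog series less one, is an assumption about how the tables of P were entered; it gives a value close to the P quoted above.

```python
import math


def chi_square_tail(chi2, df):
    """Probability that chi-square with `df` degrees of freedom exceeds `chi2`.

    `df` must be a positive whole number. For even df the tail is
    exp(-x/2) * sum_{k < df/2} (x/2)^k / k!; for odd df it is
    erfc(sqrt(x/2)) plus a similar finite series in half-integer powers.
    """
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive whole number")
    df = int(df)
    half = chi2 / 2.0
    if df % 2 == 0:
        term, series = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            series += term
        return math.exp(-half) * series
    root = math.sqrt(half)
    term = root / math.gamma(1.5)      # (x/2)^(1/2) / Gamma(3/2)
    series = 0.0
    for r in range(1, (df - 1) // 2 + 1):
        series += term
        term *= half / (r + 0.5)       # next half-integer power over its Gamma
    return math.erfc(root) + math.exp(-half) * series


p = chi_square_tail(19.518, 8)
print(round(p, 4), round(1.0 / p))
```

The computed tail is a little over .012, or about once in 80 trials, which agrees with the figure in the text to within the precision expected of interpolation in printed tables.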


For rat and goat we have

χ² = 12.201, P = .1434.
For rat and dog we have

χ² = 11.370, P = .1245.


Accordingly we see that for this material the rat strain (i) lies between the 
dog and goat strains, and (ii) is definitely interchangeable with dog and with 
goat, while the dog and goat are much more divergent. Now the sparsity here of
all the data must prevent any dogmatism; all we can reach is suggestion for
further investigation. But the following points should be noted†. The trypanosomes
through the goats were obtained from six different goats, infected directly
from the wild fly; the trypanosomes from the dogs were obtained from only four
different sources, namely from a monkey directly infected by the wild fly, from a
dog directly infected, and from two goats (89 and 125), the former only of which
is identical with one of the former six goat sources. Lastly, the rats were infected
from one dog alone, upon which the tsetse flies had directly fed. This dog is not 
identical with one of the dog sources. Now unless we assume that all the strains
of the trypanosome found in the tsetse fly are identical, which is certainly not in
accordance with the differences found in the strains of wild game from the
“fly-country”, it is by no means certain that the trypanosomes obtained from wild
G. morsitans, through goat, dog and rat as above noted came from anything like 
the same sources. Further, the closer resemblance between rat and dog strains 

* R. S. Proc. Vol. 87, B, p. 11. 


† R. S. Proc. Vol. 87, B, pp. 10, 11, and 19 to 22.


108 A Study of Trypanosome Strains 


may simply be the result of the rat strain having been developed in the dog as 
host. The divergence between the dog and goat strain may again be solely due to 
the greater variety of sources in the goat. The data from the wild G. morsitans 
experiments seem to indicate that the observed divergences between the strain 
from rat and the strain from goat or dog may not be due to difference of host, 
but to difference of source from which the material was drawn, and to difference of 
treatment of the individual stock of trypanosomes, e.g. the number of hosts, etc., 
through which it has passed. 


It seems absolutely certain that at the present time most light would be 
thrown on the conditions for asserting sameness or diversity of strains, by well 
devised experiments on strains from single sources passed through different species 
of hosts in different manners, in order to determine the exact measure of divergence 
produced by host and by treatment, and ultimately to devise a standard treatment 
for all strains which we desire to compare. 


The exact nature not only of host, but of standard treatment is most vital. We 
can demonstrate the influence of treatment at once by considering the “ percentages 
of posterior nuclear forms among short and stumpy forms” recorded by Sir David 
Bruce and his colleagues for the wild-game strain*. All the trypanosomes were 
from rats, and although the date of infection of the rat is, I think, not stated, the 
dates of first extraction will be after much the same interval, and we can therefore 
classify by date from first extraction. We find the following table: 


Wild-Game Strains.

Percentage of Posterior-Nuclear Forms among
Short and Stumpy Forms.

From first Extraction | 21% and under | 22% and over | Totals
6 days and under      |      18       |       6      |   24
7 days and over       |       6       |      18      |   24
Totals                |      24       |      24      |   48


Using Sheppard’s formula for the four-fold table, we have for tetrachoric r
r = .707;

or, the correlation between this character of the trypanosome and the time 
after infection of extraction is very considerable. It will be obvious that in a 
standardised treatment this time of extraction will play a most important part. 
But it again is not independent of the species of trypanosome, for if we take the 
wild Glossina morsitans strains†, we find:

* R. S. Proc. Vol. 86, pp. 396—404, Tables III, VI, IX, XII and XV. 

† R. S. Proc. Vol. 86, B, pp. 410-418, Tables III, VI, IX, XII and XV. I have added one percentage
by random selection from the complete table by lot in order to give 60 cases, and save labour in
fractionising.


Percentage of Posterior-Nuclear Forms among
Short and Stumpy Forms.

From first Extraction | 7% and under | 8% and over | Totals
6 days and under      |      12      |      18     |   30
7 days and over       |      18      |      12     |   30
Totals                |      30      |      30     |   60


leading to r = −.309.
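Both correlations can be reproduced from the four-fold tables. The sketch below assumes that "Sheppard's formula" here means his result for a table divided at the medians of both characters, r = cos(π(b + c)/N), where b and c are the two discordant cells; the function name `sheppard_r` is a label chosen for this illustration.

```python
import math


def sheppard_r(a, b, c, d):
    """Correlation from a four-fold table [[a, b], [c, d]] divided at the
    medians, by r = cos(pi * (b + c) / N) (assumed form, see lead-in)."""
    n = a + b + c + d
    return math.cos(math.pi * (b + c) / n)


# Wild-game strains: rows are "6 days and under", "7 days and over".
# The 6 in the first row follows from the row total of 24.
r_wild_game = sheppard_r(18, 6, 6, 18)

# Wild G. morsitans strains, same arrangement.
r_morsitans = sheppard_r(12, 18, 18, 12)

print(f"wild game: r = {r_wild_game:.3f}")
print(f"wild G. morsitans: r = {r_morsitans:.3f}")
```

With these cell frequencies the sketch returns values agreeing with the .707 and −.309 given in the text.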


In other words using tsetse fly strains and not wild-game strains, but the same 
host, we find that now the correlation is negative or the longer the infection the 
smaller the percentage. Actually the five G. morsitans strains show remarkably 
irregular results compared with the results for the wild-game strains; the ex- 
tractions were spread over much the same period, 13 to 14 days on the average, 
but were somewhat more numerous for the G. morsitans. Thus even the same 
method of extraction may give widely varying results according to the nature 
of the strain producing the infection, although the host be the same. 


To the statistician who examines the frequency distributions provided by 
Sir David Bruce and his colleagues for both wild-game strains and Glossina 
morsitans strains, there can hardly remain a doubt about the heterogeneity of 
the material in each case. We have already demonstrated this statistically for 
the wild-game strains. These strains not only differ by immense differences 
inter se, but intra se they are clearly heterogeneous. Whether this heterogeneity 
is due to the mixture of separate strains, to dimorphism within the strain, or to 
the combination of material drawn from the rat at various stages of infection, it is 
not possible on the material at present available to determine finally. The same 
remarks apply with even greater certitude to the wild G. morsitans strains than to 
the wild-game strains. But we shall return to this point in the last section of this 
paper. We have already noted that Sir David Bruce and his colleagues identify— 
against the weight of the statistical evidence—the Mvera cattle strain, the wild-game 
strain and the wild G. morsitans strain as belonging to the same species T. pecorum*.
They had previously identified other strains in wild game, G. morsitans and human
beings† with T. rhodesiense which they elsewhere describe as vel brucei‡. This is
again, I hold, against the weight of statistical evidence. But it is not clear from 
the memoirs themselves what is the exact process by which an individual fly, an 
individual human being, or the blood from a specimen of wild game is credited 
with carrying a homogeneous strain. The sizes are so different in the cases of 
T. pecorum and T. simiae that there may be no difficulty in distinction, but the 
range is so great and to the statistician the material seems so heterogeneous in the 
case of T. brucei vel rhodesiense that, perhaps, a fuller description by the authors

* R. S. Proc. Vol. 87, B, p. 26. 


† R. S. Proc. Vol. 86, B, p. 42.
‡ R. S. Proc. Vol. 86, B, p. 426.


of the process of differentiation would aid him. This is of especial importance 
if it should turn out, as I suspect, that the trypanosomes classed as T. brucei are
either dimorphic, or belong to two different species. 


In another paper* we find the trypanosomes from G. morsitans, on the basis of 
their infective powers on monkey, goat and dog, resolved into T. brucei vel
rhodesiense, T. pecorum, T. simiae and T. caprae. But it is clear that the differentiation
was not done solely by infectivity, or there would have been no means of
distinguishing T. brucei and T. pecorum which attack all three: monkey, dog and
goat. The question arises, whether T. pecorum, T. simiae and T. caprae being
readily identified by microscopic examination or size, the remainder was classed as 
T. brucei, in which case the question of the heterogeneity of this group, which 
appears to attack all animals, is rather supported than otherwise by this paper. 


Frequencies of the Various Strains for Length.
Length in Microns.

[Frequency table for the eleven strains. The individual cell frequencies are not recoverable from this copy; the strains and their totals are as follows.]

Strain                | Total
T. pecorum            | 2000
T. simiae             |  500
T. caprae             |  500
(i) T. rhodesiense    | 1000
(ii) T. brucei        | 1000
(iii) T. gambiense    | 1000
(iv) Mzimba Strain    | 1000
(v) G. morsitans      | 2500
(vi) Wild Game        | 2500
(vii) Human Strain    | 6220
(viii) Chituluka      | 1500

Sources, in the order in which they appear in this copy: R. S. Proc. 87, B, p. 13; Ibid. 85, 477; Ibid. 86, B, p. 278; Ibid. 86, B, p. 291; Ibid. 85, B, p. 227; Ibid. 84, p. 331; Ibid. 84, B, p. 350; Ibid. 87, B, p. 31; Ibid. 86, B, p. 419; Ibid. 86, B, p. 405; Ibid. 86, B, p. 330.

* R. S. Proc. Vol. 86, B, p. 422.


At any rate the exact method of differentiation adopted would be of interest 
to the statistician. The result of the paper is that the four species of trypanosomes 
occur in quite comparable permilles of tsetse flies caught in the sleeping sickness 
area of Nyasaland, and there is no evidence to show that they or other strains also 
may not occur side by side in the same fly or in the same specimen of wild game. 
Further, these compound strains would then appear in different proportions in the 
host. Some such hypothesis seems very needful to account for the extreme 
heterogeneity of the wild game, wild G. morsitans, and human strains as recorded 
by Sir David Bruce and his colleagues. The following table gives a comparison of 
what appear to be homogeneous strains—T. pecorum, T. simiae and T. caprae— 
with what appear statistically to be heterogeneous strains, i.e. T. brucei,
T. rhodesiense, T. gambiense, the Mzimba strain, the wild-game and wild G. 
morsitans strains of human type, and the human strains themselves. The table 


Means, Standard Deviations and Coefficients of Variation of eleven Strains
of Trypanosomes.

Strain               | Mean                 | Standard Deviation | Coefficient of Variation
T. pecorum           | 13.992 ± .019        | 1.2816 ± .014      |  9.16 ± .099
T. simiae            | 17.870 ± .050        | 1.6558 ± .035      |  9.27 ± [illegible]
T. caprae            | 25.508 ± .063        | 2.1011 ± .045      |  8.58 ± .184
(i) T. rhodesiense   | 23.577 ± [illegible] | 4.6764 ± .071      | 19.83 ± .311
(ii) T. brucei       | 23.529 ± .094        | 4.3938 ± .066      | 18.67 ± .291
(iii) T. gambiense   | 22.113 ± .081        | 3.7867 ± .057      | 17.12 ± .266
(iv) Mzimba Strain   | 21.413 ± .063        | 2.9586 ± .045      | 13.82 ± [illegible]
(v) G. morsitans     | 22.695 ± .058        | 4.3002 ± .041      | 18.95 ± .187
(vi) Wild Game       | 22.622 ± .047        | 3.4541 ± .033      | 15.27 ± .174
(vii) Human Strain   | 23.796 ± .035        | 4.1262 ± .025      | 17.34 ± .108
(viii) Chituluka     | 26.172 ± .084        | 4.8414 ± .060      | [illegible]


above, gives the means, standard deviations and coefficients of variation of these 
strains. It will be seen that the first three are of a very different character to the 
last five. The variation of the latter is about double that of the admittedly pure 
strains, and throughout the whole course of our further work this possibility of 
heterogeneity, and the differential selection of the components by the host must 
be borne carefully in mind. Great divergences do not discourage the use of 
biometric methods, and we get occasionally identities of strains which are quite 
beyond the limits of chance coincidence and which point to definite possibilities if 
only host, environment, and treatment are once effectively standardised. I propose 
to try to throw some light on these points in the remaining sections of this paper. 
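The "±" figures attached to the means and standard deviations can be related to the size of each sample. The sketch below assumes, as the memoir does not state it in this section, that they are probable errors of the customary form .6745 σ/√N for a mean and .6745 σ/√(2N) for a standard deviation; the function names are chosen for this illustration. It is worked for T. pecorum, for which N = 2000.

```python
import math

PROBABLE_ERROR_FACTOR = 0.6745  # ratio of probable error to standard error


def probable_error_of_mean(sd, n):
    """Probable error of a mean: 0.6745 * sd / sqrt(n)."""
    return PROBABLE_ERROR_FACTOR * sd / math.sqrt(n)


def probable_error_of_sd(sd, n):
    """Probable error of a standard deviation: 0.6745 * sd / sqrt(2n)."""
    return PROBABLE_ERROR_FACTOR * sd / math.sqrt(2 * n)


def coefficient_of_variation(mean, sd):
    """Coefficient of variation, 100 * sd / mean."""
    return 100.0 * sd / mean


# T. pecorum: mean 13.992, S.D. 1.2816, total 2000 (values from the tables).
pe_mean = probable_error_of_mean(1.2816, 2000)
pe_sd = probable_error_of_sd(1.2816, 2000)
cv = coefficient_of_variation(13.992, 1.2816)

print(f"mean 13.992 +/- {pe_mean:.3f}")
print(f"S.D. 1.2816 +/- {pe_sd:.3f}")
print(f"coefficient of variation {cv:.2f}")
```

Under that assumption the T. pecorum entries of the table (± .019, ± .014 and 9.16) are recovered, which suggests the same reading applies to the other rows.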


(6) On the Probability that Strains are alike after allowance for the Host. 


(a) Luckily in certain cases the treatment has been more or less alike. Thus 
in the wild Glossina morsitans strain, the tsetse flies brought to the Laboratory 




from the “fly-country” were in one strain (I) fed on a monkey and in the case of
four other strains (II to V) fed on dogs. From these animals thus infected others
were inoculated, but in each case only the trypanosomes from a single rat were 
used for purposes of measurement and comparison. The following table gives the 
frequency distributions of the five strains, and chiefly on the basis of these 
distributions, Sir David Bruce and his colleagues conclude that: 

“The five wild Glossina morsitans strains resemble each other closely, and all 
belong to the same species of trypanosome.” (p. 421.) 


Wild G. morsitans Strains*. 


Microns 


15 16| 17 | 18 | 19 20 | 21 | 22 | 23 | 25 | 25 | 26 | 27 | 28 | 29 (30) 31 | 32 33 | 8h | 85 

Strain I J—| 3| 11 | 25 | 56 | 62 | 75 | 53 | 31 | 28 | 44 | 16 | 23 26 | 22 |12| 6| 4] 2 — 
| 1| 4| 20! 43 | 67 | 44 | 48 | 34/| 42 | 35 | 33 43 | 28 | 23/19/13 
]—|19| 72) 84 | 85 44| 24 40 | 33) 27 18 | 21] 12) | 3] 4) —|—|— 
IV} 5/37] 60] 71 | 54| 34| 25 | 17| 13 | 15 | 19 | 34 | 30 | 31/30) 1] 1 | — 
4/27 | 57 | 94 | 49 a7! 22/ 14) 13 11 | 19] 25 | 23 | 29 18/13 7/3) 2 
7 |31/ 148 230 | 326) 252 237 184/143 110| 127/133] 113) 96 54 7] 2 


Investigating the statistical measure of resemblance in the usual way we have 
the following series of results : 


Strains I and II: χ² = 81.88, P < .000,000,1,
Strains I and III: χ² = 139.71, P < .000,000,01,
Strains I and IV: χ² = 100.15, P < .000,000,1,
Strains I and V: χ² = 115.77, P < .000,000,1,
Strains II and III: χ² = 328.12, P < .000,000,01,
Strains II and IV: χ² = 184.88, P < .000,000,01,
Strains II and V: χ² = 208.79, P < .000,000,01,
Strains III and IV: χ² = 122.79, P < .000,000,1,
Strains III and V: χ² = 147.20, P < .000,000,1,
Strains IV and V: χ² = 23.90, P = .2470.

Statistically therefore there is not the faintest resemblance whatever between 
any pair of these strains except the IV and V. These strains are for practical 
purposes interchangeable. In one out of every four trials two pairs of samples of 
500 from the same trypanosome population would give results more divergent than 
those observed. But what is the source of this resemblance? Why are these two 
strains alike and all the others widely divergent? There is nothing whatever in 
the paper to account for this agreement, and it is the more remarkable because 
Strains IV and V are to the statistician the most compound looking of all the 
strains. But some uniformity of origin or treatment has caused the two com- 
ponents to appear in like proportions, and at the back of this resemblance there is 
some vital point, if we could follow it up. Were the two dogs bitten by the same 

* R.S. Proc. Vol. 86, B, p. 409 et seq. 


Totals 
500 

500 

500 
500 

500 

| 2500 | 



fly, or Rats 658 and 660 really inoculated from the same dog? Clearly there is a 
point here which ought to be cleared up, for otherwise the statistician could only 
conclude that the wild G. morsitans strains are widely divergent, and that their 
compound nature suggests that the tsetse fly carries various types of trypanosomes 
and these in varying proportions.
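The passage from a value of χ² to the probability P can be sketched as follows. It is assumed here that P is the chance of exceeding the observed χ² when the number of degrees of freedom is one less than the number of length groups compared; the closed series used for even and odd degrees of freedom are standard results, and the function name `chi_square_tail` is introduced for this illustration only.

```python
import math


def chi_square_tail(chi2, cells):
    """Probability of a chi-square greater than chi2, taking the degrees
    of freedom as cells - 1 (assumed convention, see lead-in)."""
    df = cells - 1
    half = chi2 / 2.0
    if df % 2 == 0:
        # Even df = 2m: exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite series in x.
    tail = math.erfc(math.sqrt(half))
    if df == 1:
        return tail
    term, total = 1.0, 1.0
    for k in range(1, (df - 1) // 2):
        term *= chi2 / (2 * k + 1)
        total += term
    return tail + math.sqrt(2.0 * chi2 / math.pi) * math.exp(-half) * total


# Strains IV and V: 21 length groups (15 to 35 microns).
p_iv_v = chi_square_tail(23.90, 21)
# Goat and dog, wild G. morsitans strain: 9 occupied groups.
p_goat_dog = chi_square_tail(19.518, 9)
# Goat as host, wild G. morsitans against wild game: 10 groups.
p_goat = chi_square_tail(26.782, 10)

print(f"IV and V: P = {p_iv_v:.4f}, about once in {1 / p_iv_v:.0f} trials")
print(f"goat and dog: P = {p_goat_dog:.4f}")
print(f"goat as host: P = {p_goat:.4f}")
```

On this reading the value for Strains IV and V comes out close to one in four, in agreement with the statement in the text, and the other two values fall near the .0125 and .0015 quoted elsewhere in the memoir.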

(b) I now turn to the five human strains dealt with by Sir David Bruce and 
his colleagues. Let us first consider the human strains compounded from various 
animals. The following table gives the length distributions : 


Human Strains. A: Compounds from Various Animals*. 


Microns. 
14| 15 | 16 17 | 18 | 19 | 20 a1 | 22 23 | 2h 25 | 26 
Strain I, Mkanyanga ... J 1 4|19 | 42 | 63 81 75| 91 65 | 66| 93) 91/107 
— 2] 2/12) 55 108 159/210) 188/215} 177| 138) 83 
» ILI, Chituluka 1] 8 | 48| 81, 78 71 44 46 56) 53| 98| 120 
»  1V,Chipochola ...]— 2| 4 | 32 | 68 110 101|109/106 95 95) 74| 64! 
» Chibibi — | 1} 8 | 20) 58/117 | 122] 123 107| 93 98) 63) 51) 
Sun 1 | 10 | 41 | 154) 325| 494 528! 577 | 512| 525 | 511 | 464 425 | 
Human Strains. A: Compounds from Various Animals—(continued). 
Microns. 
27 | 28 | 29 | 30 | 31 | 82 | 38 | 34 | 35 36 | 87 38 | Totals 
Strain I, Mkanyanga 110/104) 49] 27/23 7]; 1] 1220 
,E 60| 34] 26/ 18] 8| 1|—|—J] 1500 
» ILI, Chituluka 111/ 128/138] 99/117] 91 | 68 | 27/11] 9| 1 | 1 | 1500 
IV, Chipochola ... 50| 38} 26) 16 5) 3 1 | — 1000 
»  V,Chibibi ...] 43] 30] 16] 10] 3|—|—|—]|—] 1 | —]} 1000 
Sum 372| 347 | 307 | 198 | 167 77 | 36 | 12|11| 2 | 1 | 6220 


We may compare the strains precisely as in the case of the wild G. morsitans
strains. We find:

Strains I and II: χ² = 408.50, P < .000,000,01,
Strains I and III: χ² = 204.99, P < .000,000,01,
Strains I and IV: χ² = 180.63, P < .000,000,01,
Strains I and V: χ² = 205.40, P < .000,000,01,
Strains II and III: χ² = 923.62, P < .000,000,001,
Strains II and IV: χ² = 79.01, P < .000,000,5,
Strains II and V: χ² = 77.66, P < .000,000,5,
Strains III and IV: χ² = 531.32, P < .000,000,01,
Strains III and V: χ² = 563.82, P < .000,000,01,
Strains IV and V: χ² = 16.81, P = .7733.


* R. S. Proc. Vol. 86, B, pp. 287, 291, 295, and 297. For Strain I see R. S. Proc. Vol. 85,
B, p. 423.


Again we have the remarkable result that all the human strains are statis- 
tically divergent beyond any possible comparison, except those of Chipochola and 
Chibibi which show a high degree of correspondence. Now is this result the 
outcome of treatment? We note the following diversity of hosts : 


           |     Strain I.      |     Strain II.     |    Strain III.     |     Strain IV.     |     Strain V.
           | Gross | Percentage | Gross | Percentage | Gross | Percentage | Gross | Percentage | Gross | Percentage
Men        |   60  |    4.9     |   —   |    0.0     |   —   |    0.0     |   —   |    0.0     |   —   |    0.0
Monkey     |  100  |    8.2     |  160  |   10.7     |  160  |   10.7     |  160  |   16.0     |  160  |   16.0
Goat       |   20  |    1.6     |   60  |    4.0     |   80  |    5.3     |   80  |    8.0     |   80  |    8.0
Sheep      |   60  |    4.9     |   20  |    1.3     |   —   |    0.0     |   —   |    0.0     |   —   |    0.0
Dog        |  260  |   21.3     |  260  |   17.3     |  260  |   17.3     |  260  |   26.0     |  260  |   26.0
Guinea Pig |  120  |    9.8     |   —   |    0.0     |   —   |    0.0     |   —   |    0.0     |   —   |    0.0
Rat        |  600  |   49.2     | 1000  |   66.7     | 1000  |   66.7     |  500  |   50.0     |  500  |   50.0
Totals     | 1220  |            | 1500  |            | 1500  |            | 1000  |            | 1000  |


Now it will be clear at once that the percentages of trypanosomes drawn from 
various types of host are identical only in the case of Strains IV and V, which we 
have found in close accordance. But there is not great divergence in source 
between Strains II and III although Strain I shows fairly wide differences. We
find, however, that II and III are statistically very unlike, the next closest 
resemblances, although very slight, being between II and IV and V. It would 
not seem therefore that the degree of similarity is wholly determined by similarity 
of hosts. I have accordingly reinvestigated the five human strains by taking rats 
only. But, of course, even then it is of vital importance to be certain that the 
process of transfer from man to rat was the same in all five cases, and of this no 
evidence is provided. 


Human Strains. B: From Rat only*. 


Microns. 
| 17 | 18 | 19 | 20 | 21 22 | 23 | 24 | 25 | 26 a7 | 
| 

| Strain I, Mkanyanga 1 | 1/21 | 40 | 52 | 49 | 30 | 31 | 36 | 33 | 48 | 52 

» E, Rat 728 | —| 4|15| 30| 57] 72] 85 | 72 | 59 | 44 | 96 
| ” 41 E Rat 726.. —|—! 2/24] 30 2 60 | 61 | 87 | 73 | 55 | 27 | 20 
| ” TIT, Chituluka, Rat 952 | 1 27 | 28/15 | 10 | 15 | 19 | 21 | 34 | 44 | 36 
Chituluka, Rat 953 | 1/17. 26/ 20/19/15 | 14| 26 | 18 | 33 | 40 | 34 
| IV, Chipochola, Rat 1337] —|— | 4) 6| 16 | 29| 53 | 61 | 59 | 69 | 56 | 51 | 36 
| 2 'V,’ Chibibi, Rat 1660... | —- |- —| 4/17 29| 46 | 63 | 69 | 73 | 52 | 40 | 31 
| | 290 376] 362 | 32 2| 204 235 


* R. S. Proc. Vol. 86, B, pp. 288, 289, 292, 293, 295, and 298. For Strain I see R. S. Proc.
Vol. 85, B, p. 423.
Human Strains. B: From Rat only—(continued). 


Microns. 
28 | 29 | 30 | a1 32 | 33 | 34 | 35 | 36 | 37 | 38 | Totals 
Strain I, Mkanyanga | 87 | 57 | 35 | 21] 18] 11 | 6| 1 | —| —]- 600 
» Il, E,Rat726... 8| 4] 2] so 
» III, Chituluka, Rat 952 | 41 | 48 | 28 | 43 | 27| 23/10) 4) 5/1 > 14 500 
» ILI, Chituluka, Rat 953 | 37 | 47 | 37 | 33 | 41 | 23} 12| 4 | 3. | — | —] 500 
Rat 1337] 31 | 14 | 13 | 2| — 
»  V, Chibibi, Rat 1660... 33 | 25/11 | 6| 1|—|—|—|—|— —] 500 
{219 | 210 | 134] 108 ss | 57 | 28 | 9 | 8 | 1 | 1 | 3600 


This table with its two pairs of rats inoculated from the same strains is 
peculiarly instructive. We can compare II, Rat 726, with II, Rat 728. 


We find: χ² = 36.195, giving P = .0048.


This is far from the high degree of divergence we have found between the com- 
pound human strains, but it is not satisfactory as a measure of the agreement of 
the same strain in two hosts of the same species. 


Applying the same test to the two Rats 952 and 953 of Strain III we have: 
χ² = 14.715, giving P = .9038.


This is, of course, quite satisfactory. We should not hesitate to assert identity 
of strains and of treatment in the case of the trypanosomes from these two rats. 
The statistician will feel fairly confident that there is a factor of divergence 
between the trypanosomes of the two rats in Strain II, which does not occur in 
the two rats of Strain III. He will be almost certain that the strain was not 
conveyed through the same steps or at the same stage of the disease to the rats in 
Strain II. Unfortunately dates and processes are not discussed. Sir David Bruce 
and his colleagues say that it is remarkable how much alike these distributions for 
Rats 726 and 728 are, and again for the distributions for Rats 952 and 953 that 
they also closely resemble each other. “It is curious and striking that the same 
strain of trypanosome growing in two different animals should show this remarkable 
similarity*.” The interesting point is that the statistician would agree with the 
remarkable similarity in the latter case, but the divergence not the remarkable 
resemblance in the first case would force him to seek for some explanation in 
treatment. It will, I think, be clear from these illustrations that a strain of 
trypanosomes, even if obviously compound, can be taken from a single source and 
after inoculation into two different individuals of the same species be identified 
as same; but to insure this result on every repetition the greatest caution will 
have to be exercised as to identity of process and treatment. 


* R.S. Proc. Vol. 86, B, pp. 289 and 293. 


There are still further results of importance to be ascertained, however, from 
our table of human strains. Let us compare Strains IV and V, which we found 
resembled each other closely even for compounded hosts. We now reach 


χ² = 14.035 and P = .5229.


Or, the probability that these two strains are identical has been reduced by 
selecting out the rat data only. But the result is still so high that no one would 
hesitate to assert that Chipochola and Chibibi were suffering from a disease due 
to the same strain of trypanosome. The correspondence is so close that we have 
combined Strains IV and V for all other comparisons. In the case of Strain III,
we have added together the results for Rats 952 and 953. Such addition is less 
reasonable for Rats 726 and 728, but without doing this, it is impossible to decide 
which rat is to represent the E strain. I have then made the following com- 
parisons : 
Strains IV and V with III: χ² = 525.67, P < .000,000,01.

There is accordingly no similarity at all between the Chituluka strain and that 

common to Chipochola and Chibibi. 


Strains IV and V with II: χ² = 64.70, P < .000,001.


Thus the strain from the European E from Portuguese East Africa diverges 
from the Nyasaland strain widely, but not as widely as that of Chituluka does 
from those of Chipochola and Chibibi. 


Strain I with Strain III: χ² = 126.13, P < .000,000,1,
Strain I with IV and V: χ² = 217.82, P < .000,000,01.


Thus the trypanosomes from Mkanyanga are widely divergent from those of 
the three other Nyasaland cases. Nor are they any closer to the European E: 


Strain I with Strain II: χ² = 331.37, P < .000,000,01.


Thus with the exception of the Chipochola and Chibibi strains, the trypanosome
distributions from human sources differ widely. Nor is this to be wondered at, if 
the human beings owe their trypanosomes to Glossina morsitans, for in that case 
we should expect the human strains to be as diverse as we have found those from 
the tsetse fly itself. It would remain to explain the close similarity of the 
Chipochola and Chibibi cases. It would be interesting to know the history of 
these cases with regard to locality and to the possibility of a unique source 
of infection. 


(c) In the case last dealt with, namely that of Chipochola and Chibibi, we 
have the remarkable feature that the strains although significantly identical, 
whether treated in the rat alone or in compounded distributions from various hosts, 
resemble each other somewhat less closely in the single host series. This is not 
generally the rule. Some of the big divergencies we have already noticed become 
far less appreciable, nay, even become resemblances when we confine our attention 
to one species of host. The chief misfortune which then too often arises is the 




paucity of the total numbers that we have at our disposal. I will consider, 
however, from this aspect the relations of the three strains wild G. morsitans, 
wild game, and Mvera cattle. 


I compare first the lengths of 200 trypanosomes from wild G. morsitans and 
wild-game strains. These yield for the host, goat* : 


                                    Microns.

From Goat                | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | Totals
Wild G. morsitans Strain | 1 |  3 | 12 | 21 | 55 | 60 | 32 | 12 |  4 |  — |  200
Wild-Game Strain         | — |  — |  1 | 16 | 37 | 73 | 38 | 26 |  8 |  1 |  200

giving: χ² = 26.782 and P = .0015.

To further test this, I take the same two strains in the dog as host†:
Microns. 
From Dog 9 | 10 | 11 | 12) 13 | 14 15 | 16 | 17 | 18 
Wild G@. morsitans Strain | — | — | 3 | 14 | 34 | 41 | 40 | 19 Ree 160 
Wild-Game Strain | | | 31 | 57 | 50 | 24 | 6 | 
Here χ² = 7.045 and P = .3171.


The value we had previously found for a mixture of all strains was P = .0002.
Thus the two strains may be considered as identical when we deal with the 
trypanosomes from the dog, as showing considerable divergence when we take the 
goat, and as showing marked divergence when we take a great variety of hosts. 
The weight of evidence in favour of a standardised treatment thus becomes very 
great. 


Let us look at precisely the same material for the wild-game strain and for 
the Mvera cattle strain, first for the goat and then for the dog as host‡. The
grave difficulty is the paucity of measurements thus differentiated. 


Microns. 
| From Goat 9 \10\ 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | Totals 
Wild-Game Strain ...|— | —J| 1 | 16/| 37|73|38/26| 8 | 1 | 200 | 
Mvera Cattle Strain .. | 1 1 3 | | | 100 | 
| 


This gives χ² = 14.670, leading to P = .1013.


* R. S. Proc. Vol. 87, B, pp. 6 and 11. 
† R. S. Proc. Vol. 87, B, pp. 6 and 11.
‡ R. S. Proc. Vol. 87, B, pp. 3 and 5.


Microns. 
| 
From Dog | 11 | 12 | 13 | 14 | 15 | 16 | 17 ‘neon 
| Wild-Game Strain 31/57] 50| 24! 6 | 180 
3 |11 197) 30/21} 8|—! 100 


| 
This leads to χ² = 15.992, P = .0138.
Previously (p. 98) on the total series of different hosts we had found
P = .000,243. Thus by referring our material to individual hosts, we have reduced
the degree of divergency between the wild-game and Mvera cattle strains, but it 
would be still hazardous to state that these strains are identical. 
Lastly, we turn to the Mvera cattle strain and the wild G. morsitans strain 
dealing with dog and goat as hosts separately * : 


| Mvera Cattle Strain 


From Goat 9 | 10} 11 | 12| 13} 14) 15 | 16 | 17 | Totals 


| 
| 


| Wild morsitans Strain ...| 1 | 3 | 12| 21) 55 | 60] 32 |12| 4 | 200 
Mvera Cattle Strain 1 3/14 2} 19 | 13; 1 100 


This gives χ² = 7.968, P = .4368.
And again : 


                                    Microns.

From Dog                 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | Totals
Wild G. morsitans Strain | — |  — |  3 | 14 | 34 | 41 | 40 | 19 |  9 |  160
Mvera Cattle Strain      | — |  — |  3 | 11 | 27 | 30 | 21 |  8 |  — |  100

resulting in χ² = 11.120, P = .0852.


The Mvera cattle strain and the Glossina morsitans strain had for all hosts a 
divergence measured by P = .000,008. Thus the great bulk of this divergence
is due to multiplicity of hosts†.


To sum up the results obtained for T. pecorum in Mvera cattle, wild G. morsitans
and wild-game strains, the identification of these strains was quite illegitimate on 
the basis of the compound host frequencies. It is reasonable on the basis of 

* R. S. Proc. Vol. 87, B, pp. 3, 10—11. 

† It is worthy of note that in comparisons with the cattle strain the goat appears to give closer
results than the dog, but the dog appears the better in the comparison of the G. morsitans and
wild-game strains.


trypanosomes taken from a single species of host. But how far the resemblance 
in these cases is produced by a selective influence of the host and not necessarily 


by an identity of all the members of the strain before transference to the host is 
not demonstrated. 


On the other hand while divergence due to host will account for the divergences 
which are so notable in T. pecorum, it will not account for the divergences in the
human strains; these are startlingly conspicuous even if we confine our attention 
to a single species of host. Precisely the same remarks apply to the trypanosomes 
similar to those causing disease in human beings found in wild game and in the 
tsetse fly itself. There must be another source for these divergences. 


(7) Discussion of the Heterogeneity which is statistically demonstrable in the 
bulk of the Trypanosome Measurements. 


The reader who has attentively followed the course of the argument in the 
previous sections will be prepared for the next step in this memoir, the attempt to 
account for the large divergences between strains of trypanosomes in individuals 
of the same species by the heterogeneity of those strains. My suggestion is that 
the strain in one fly differs from that in another because the components do not 
appear in the same proportion, the strain in one specimen of wild game from that 
in another, or in one man from that in another because they have been bitten by 
a fly containing the components in unlike proportions. The host does make some 
difference, either by nutrition or selection of trypanosomes, but it is a minor differ- 
ence. Thus consider what we may probably hold to be pure strains and observe 
the average differences in length found by Sir David Bruce and his colleagues: 


                                          Microns.

T. simiae*          | T. caprae†          | T. pecorum, Mvera Cattle‡ | T. pecorum, Wild G. morsitans§
Goat           17.3 | Waterbuck      26.8 | Donkey               13.5 | Goat                 13.5
Monkey         18.1 | Ox             25.7 | Ox                   14.2 | Monkey               13.6
—                   | Goat           25.3 | Goat                 13.8 | Dog                  14.2
—                   | Sheep          25.6 | Dog                  13.8 | Guinea Pig           14.6
                    |                     | Rat                  14.8 | Rat           [illegible]
Max. Difference 0.8 | Max. Difference 1.5 | Max. Difference 1.3       | Max. Difference 1.1


We may thus anticipate that in a pure strain the change of host would hardly 
make a difference of more than 2 microns in the average length. We must 


* R. S. Proc. Vol. 85, B, p. 479.
† R. S. Proc. Vol. 86, B, p. 279.
‡ R. S. Proc. Vol. 87, B, p. 3.
§ R. S. Proc. Vol. 87, B, p. 10.


accordingly be prepared for some such change as this in the shifting of the mean 
when the host is varied. 


We have next to inquire what type of curve accurately describes the strains 
which we are fairly certain are homogeneous. 


If the reader will turn back to p. 110 he will note at once a marked difference
between the distributions for T. caprae, T. pecorum and T. simiae when compared
with those entitled Mzimba strain, human strain, wild-game strain, T. brucei,
T. rhodesiense, T. gambiense and the wild G. morsitans strain. The coefficients of
variation of the former group are all under 9.5 (mean = 9.00), the coefficients of
variation of the latter group are all over 13.5 (mean = 17.29). We recognise
therefore a totally different order of variability. Even in absolute variation as
measured by the standard deviations we find the first group with its mean
S.D. = 1.68 and the second with its mean 3.96. An examination of the graphs
scattered through the trypanosome papers to which we have referred will, we think,
convince the statistician that we have to deal with heterogeneous and not skew
homogeneous material*. It becomes of course important to ascertain whether in
the pure strains a Gaussian curve will suffice to describe the frequency closely
enough for statistical purposes, for, if it does, the analysis into at any rate two
Gaussian components of the heterogeneous strains becomes relatively direct, if
laborious. I will consider the T. pecorum, T. simiae, and T. caprae strains
from this standpoint.


(a) T. pecorum (see p. 110).
Mean = 13.992 microns. S.D. = 1.2816 microns.

Microns          Observed    Calculated
                 Values      Values
9 and under          2          0.46
10                   6          5.98
11                  42         45.41
12                 193        192.52
13                 452        456.70
14                 618        607.12
15                 453        452.49
16                 178        188.98
17                  51         44.16
18 and over          5          6.20

χ² = 7.630, P = .572.


Hence in 57 out of 100 trials from material following the Gaussian distribution
a more divergent sample than that observed would actually be obtained. We can
therefore conclude that a simple Gaussian frequency adequately describes the
distribution in size of T. pecorum. This is illustrated in Diagram II.
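
The test just described can be retraced with a short computation. The following is a minimal sketch, not the procedure actually used for the memoir: it assumes one-micron classes centred on whole microns with open end classes, and it takes the number of groups less one as the degrees of freedom, which is the convention under which the printed P = .572 is recovered. The function names are illustrative only.

```python
import math


def normal_cdf(x, mean, sd):
    """Gaussian probability of a value below x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def expected_frequencies(classes, mean, sd, total):
    """Expected count in each one-micron class centred on a whole micron.

    Assumption: the first and last classes are open-ended
    ("and under", "and over") and absorb the tails.
    """
    expected = []
    for i, centre in enumerate(classes):
        lower = -math.inf if i == 0 else centre - 0.5
        upper = math.inf if i == len(classes) - 1 else centre + 0.5
        share = normal_cdf(upper, mean, sd) - normal_cdf(lower, mean, sd)
        expected.append(total * share)
    return expected


def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over the groups."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_tail(chi2, df):
    """Probability that chi-square with integer df exceeds chi2 (closed forms)."""
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (chi2 / 2.0) / r
            total += term
        return math.exp(-chi2 / 2.0) * total
    chi = math.sqrt(chi2)
    term, total = 0.0, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        term = chi if r == 1 else term * chi2 / (2 * r - 1)
        total += term
    density = math.exp(-chi2 / 2.0) / math.sqrt(2.0 * math.pi)
    upper_tail = 0.5 * math.erfc(chi / math.sqrt(2.0))
    return 2.0 * upper_tail + 2.0 * density * total


# T. pecorum, classes "9 and under" to "18 and over", total 2000.
classes = list(range(9, 19))
observed = [2, 6, 42, 193, 452, 618, 453, 178, 51, 5]
expected = expected_frequencies(classes, mean=13.992, sd=1.2816, total=2000)
chi2 = chi_square(observed, expected)
p = chi_square_tail(chi2, len(classes) - 1)  # assumed: groups less one
print(round(chi2, 2), round(p, 2))
```

Because the expected count in the lowest class is under one, the value of χ² is sensitive to rounding there, so small departures from the printed 7.630 are to be expected.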

* Note especially the bimodal graphs in R. S. Proc. Vol. 83, B, pp. 5 and 11, for both the Uganda
and Zululand strains of T. brucei, in Vol. 86, B, pp. 291-293, for human strains, in Vol. 86, B,
pp. 395, 397 for wild-game strains and pp. 409, 411, 417 and 419 for G. morsitans strains.

[Diagram II. Gaussian fitted to T. pecorum Frequency. Total 2000. Ordinate: frequencies per micron; abscissa: microns.]


(b) T. simiae (see p. 110).
Mean = 17.870 microns. S.D. = 1.6558 microns.

Microns          Observed    Calculated
                 Values      Values
14 and under         7         10.46
15                  28         27.63
16                  76         63.92
17                  93        103.78
18                 126        118.32
19                  92         94.66
20                  47         53.18
21                  22         20.96
22                   6          5.80
23 and over          3          1.29

χ² = 8.149, P = .520.


[Diagram III. Gaussian fitted to T. simiae Frequency. Total 500. Ordinate: frequencies per micron; abscissa: microns.]


We conclude that the Gaussian adequately describes the distribution of 
T. simiae. In more than half the trials we should get a worse sample. See for 
graphical fit, Diagram III. 


(c) T. caprae (see p. 110).
Mean = 25.508 microns. S.D. = 2.1011 microns.

Microns          Observed    Calculated
                 Values      Values
20 and under         4          4.28
21                   8          9.82
22                  23         23.95
23                  49         46.74
24                  79         73.05
25                  95       [illegible]
26                  80       [illegible]
27                  68         73.45
28                  57         47.16
29                  24         24.26
30                   9          9.98
31 and over          4          4.38

χ² = 5.175, P = .921.

This is a still more excellent fit; if the Gaussian represented the population,
in 92 % of samples we should get a more divergent sample than that observed.

The curve is given in Diagram IV.


[Diagram IV. Gaussian fitted to T. caprae Frequency. Abscissa: microns.]


It will be clear from the above three illustrations of what we may term
homogeneous trypanosome strains that the Gaussian curve of frequency suffices
to describe adequately such material. It is equally clear that no Gaussian can
possibly describe such skew distributions as we get in the wild-game strain or wild
tsetse fly strain of the trypanosome species identified by Sir David Bruce and
colleagues as T. rhodesiense*. It is equally impossible in the case of the human
strains figured in the paper of February 1913†. I illustrate this on the frequency
distribution for 6220 trypanosomes of human strains‡.


Microns          Observed      Calculated
14 and under          1           75.45
15                   10           62.51
16                   28        [illegible]
17                  154          155.50
18                  325          224.73
19                  494          306.27
20                  528          393.73
21                  577          477.39
22                  512          545.93
23                  525          588.91
24                  511          599.17
25                  464          575.04
26                  425          520.55
27                  372          444.42
28                  347          357.96
29                  307          271.88
30                  198          194.81
31                  167          131.68
32              [illegible]       83.91
33              [illegible]       50.44
34              [illegible]       28.61
35              [illegible]       15.30
36 and over     [illegible]       14.18

Here χ² = 501 and P < .000,000,001. In other words description by a Gaussian
is absolutely impossible. The histogram of observations and the curve are shewn
on Diagram V.


Now the suggestion that flowed at once from these results was the compound
nature of all the material classed under the headings:
(i) T. rhodesiense.
(ii) T. brucei.
(iii) T. gambiense.
(iv) Mzimba Strain.
(v) Wild G. morsitans Strain.
(vi) Wild-Game Strain.
(vii) Human Strain.
With the experience of the Gaussian fitting the homogeneous strains, the direct
step was to investigate whether the above material could be analysed into two
Gaussian components and to determine how nearly these components were in
agreement. The method of carrying out this analysis was provided in the first
of my series of Contributions to the Mathematical Theory of Evolution§. There
was nothing to prevent the process being applied to every individual frequency
given by the trypanosome workers, except the very laborious arithmetic.

* See R. S. Proc. Vol. 86, B, pp. 407 and 419.
† See R. S. Proc. Vol. 86, B, pp. 285 et seq.
‡ See R. S. Proc. Vol. 86, B, p. 300.
§ Phil. Trans. Vol. 185, A, pp. 71-110, 1894.

[Diagram V. Gaussian fitted to the distribution of the human strains; caption only partly legible. Abscissa: microns.]

The method was applied to the above seven cases, and also (viii) for the purposes of
illustration to a single human case, that of Chituluka, a native of Nyasaland, who
died of sleeping sickness*. With the single exception of T. brucei every one of
these distributions broke up into two components, and into two components with
strikingly close means. I propose to call these two components T. minus and
T. majus. I do not assert that they are distinct species; they may be dimorphic
groups of one and the same trypanosome species. But the recognition of their
existence seems to bring some order at least into the chaos we have already noted
as existing in the trypanosome measurements. Two human strains or two wild-game
strains differ from each other with such wide divergence in their frequencies
because these two groups T. minus and T. majus are mixed in the individual
in different proportions.


                           Means            Standard Deviations   Coefficients of Variation   Size of Population
                      T. minus  T. majus    T. minus  T. majus      T. minus    T. majus      T. minus     T. majus
T. rhodesiense        18.7418   26.1122     2.3184    3.4397        12.370      13.173        [illegible]  [illegible]
T. brucei             19.8244   26.1122     2.6439    3.4134        13.337      13.072        [illegible]  [illegible]
T. gambiense          19.8926   26.2463     2.0566    2.6260        10.339      10.005        [illegible]  [illegible]
Mzimba Strain         19.8966   24.0508     1.3961    3.1028         7.017      12.901        [illegible]  [illegible]
G. morsitans Strain   19.6475   27.1966     1.7503    2.7013         8.908       9.932        [illegible]  [illegible]
Wild-Game Strain      20.4418   25.8263     1.6332    2.8799         7.990      11.151        59.5 %       40.5 %
Human Strain          20.3687   26.2930     1.9444    3.4470      [illegible]   13.110        [illegible]  [illegible]
Chituluka             19.8410   28.7875     1.9785    2.8823         9.972      10.012        [illegible]  [illegible]

Means                 19.8315   25.9542     1.8498    3.0328         9.360      11.712        -            -

T. simiae             17.870    -           1.6558    -           [illegible]   -             [illegible]  -
T. caprae             -         25.508      -         [illegible]   -            8.580        -            100 %


* R. S. Proc. Vol. 86, B, p. 291.

The accompanying table gives the chief biometric characters of T. minus and T. majus
as found from the seven resolutions. The mean values of the constants for
T. minus and for T. majus are placed at the foot; in calculating these mean values,
Chituluka's data have been excluded as already included in the human strain, and
also those for T. brucei not directly resolved.

At the foot of the table I have placed the constants for T. simiae and T. caprae,
the nearest pure strains to T. minus and T. majus respectively. I do not in the
least suggest there is any identity, but comparison may bring home to the
trypanosome worker the average sizes of the two components*. The differences
of the variabilities are, however, much larger, and the influence of host on
variability as well as on mean ought to be studied.
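
The derived entries of the table on p. 125 can be checked from the means and standard deviations alone. The sketch below assumes only that the coefficient of variation is 100 × S.D. / mean and that the foot-of-table means are simple averages over the six resolutions named in the text (Chituluka and T. brucei excluded). The dictionary layout and function names are illustrative.

```python
# (mean, standard deviation) in microns for T. minus and T. majus,
# six resolutions entering the averages at the foot of the table.
components = {
    "T. rhodesiense": ((18.7418, 2.3184), (26.1122, 3.4397)),
    "T. gambiense":   ((19.8926, 2.0566), (26.2463, 2.6260)),
    "Mzimba":         ((19.8966, 1.3961), (24.0508, 3.1028)),
    "G. morsitans":   ((19.6475, 1.7503), (27.1966, 2.7013)),
    "Wild-game":      ((20.4418, 1.6332), (25.8263, 2.8799)),
    "Human":          ((20.3687, 1.9444), (26.2930, 3.4470)),
}


def coefficient_of_variation(mean, sd):
    """Standard deviation expressed as a percentage of the mean."""
    return 100.0 * sd / mean


def average(values):
    values = list(values)
    return sum(values) / len(values)


mean_minus = average(minus[0] for minus, majus in components.values())
mean_majus = average(majus[0] for minus, majus in components.values())
sd_minus = average(minus[1] for minus, majus in components.values())
sd_majus = average(majus[1] for minus, majus in components.values())

print(round(mean_minus, 4), round(mean_majus, 4))
print(round(sd_minus, 4), round(sd_majus, 4))
for name, (minus, majus) in components.items():
    print(name,
          round(coefficient_of_variation(*minus), 3),
          round(coefficient_of_variation(*majus), 3))
```

The same function applied to the human T. minus component gives a value close to 9.5, which may help in reading the entry that is illegible in the table.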

It will be seen at once that the divergence in the individual means of T. minus
from the general mean is very slight, at most a micron, and well within the limits
which arise, as we have seen, from difference of host. It is a most remarkable fact
that from six independent reductions the mean size of T. minus should come
out so nearly 19.8 microns. In T. majus the correspondence is not so good; the
average of about 26 microns falls to 24 in the Mzimba strain and rises to 28.8 in
the case of Chituluka†. Still it does not appear to me that these changes of
mean of the T. majus component are absolutely beyond the variation due to
differences of host and treatment. Another more serious matter is the comparatively
wide range found for the variabilities; but even here it is impossible to assert that
such differences will not occur with difference of host. For example the Mvera
cattle strain, a fair sample of the simple T. pecorum, gives:


Host            Mean      Standard     Coefficients of
                          Deviation    Variation
Goat            13.80      1.462        10.592
Rat             14.75       .839         5.689
[illegible]     13.79      1.087         7.885


Here while the means are within one micron, the differences in variability are
of the same order as those found in T. majus from different hosts.

Again, taking a pure homogeneous strain as T. caprae with goat and sheep as
host, which are scarcely so differentiated as man and antelope, we find:

Host            Mean      Standard     Coefficients of
                          Deviation    Variation
Goat            25.31      2.187         8.642
Sheep           25.60      1.923       [illegible]

Lastly, taking T. simiae for goat and monkey we have:

Host            Mean      Standard     Coefficients of
                          Deviation    Variation
Monkey          17.26    [illegible]     8.127
Goat            18.11    [illegible]     9.315

* The maximum average length of T. caprae is 26.8 in the waterbuck and of T. simiae 18.1.
† It should be noted that with the whole of the human data the mean is 26.33 and that Chituluka's
mean is very exceptional.


I think we may conclude that, allowing for the errors of random sampling
and the errors arising from the resolving process, the deviations observed in the
variability of our two components do not invalidate the hypotheses:

(i) That the widely divergent results obtained from different strains are due
to the existence in the same individual of two types of trypanosome with very
varying percentages from individual to individual.

(ii) That one of these types has a mean length of about 19.8 microns and a
variability of about 1.8 microns, the other a mean of about 26.0 microns and
a variability of about 3.0 microns. The means may vary 1 or 2 microns with the
nature of the host and the variability 0.5 to 1 micron.

The large type predominates in the Nyasaland human strains*, on the average
in about the ratio of 3 to 2, but the smaller type predominates in the G. morsitans
and wild-game strains in about the same ratio; while in the trypanosomes classed
as T. rhodesiense, and T. gambiense as well as in the strain from the Mzimba
donkey the preponderance is still of the smaller type and the ratio approaches
13 to 7. Whether these ratios are peculiar to the host or due to the infecting fly,
it is not at present possible to determine. But the hypothesis of the existence of
these two types, whether as a dimorphism of T. rhodesiense or as independent
species, seems to bring some order into the apparent chaos of recent trypanosome
measurements.
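
The ratios quoted here follow from the component means once the mean of the whole distribution is known, since the mean of a mixture is the weighted mean of its parts. The sketch below is an illustration only: the overall means are those listed among the constants for the separate reductions later in the memoir, with the decimal points restored on the assumption that they stand after the second figure, and the function name is not from the original.

```python
def minus_share(overall_mean, mean_minus, mean_majus):
    """Fraction of T. minus in a two-component mixture.

    Solves  overall = p * mean_minus + (1 - p) * mean_majus  for p.
    """
    return (mean_majus - overall_mean) / (mean_majus - mean_minus)


# (overall mean, T. minus mean, T. majus mean), all in microns.
strains = {
    "T. gambiense": (22.1130, 19.8926, 26.2463),
    "Mzimba":       (21.4130, 19.8966, 24.0508),
    "Wild-game":    (22.6220, 20.4418, 25.8263),
}

shares = {name: minus_share(*means) for name, means in strains.items()}
for name, share in shares.items():
    print(f"{name}: T. minus {100 * share:.1f} %, T. majus {100 * (1 - share):.1f} %")
```

For the wild-game strain this returns about 59.5 % of T. minus, in agreement with the legible population entry of the table on p. 125, and for T. gambiense and the Mzimba strain about 65 % and 63.5 %, which is the ratio of roughly 13 to 7.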


The following paragraphs give the calculated constants of the reductions, and
the numbers of the diagrams showing the nature of the compound frequencies:

(i) T. rhodesiense.
Mean = 23.577,
μ₂ = 21.86874, μ₄ = 1079.10255,
μ₃ = + 4.01986, μ₅ = + 1105.74834.
Reducing nonic:
24q⁹ − 298.7232q⁷ − .5817q⁶ + 1114.7684q⁵ + [illegible]q⁴
− [illegible]q³ + 12.9808q² + .0891q + .0001 = 0
where† p₂ = − 10q.
The root is p₂ = − 12.2578. This leads to the two components in the Table
p. 125. The histogram of the observations and the two component Gaussian
curves with their compound are given in Diagram VI.

* The European from Portuguese East Africa had predominance of T. minus. See R. S. Proc.
Vol. 86, B, p. 288.
† Notation of the memoir Phil. Trans. Vol. 185, A, p. 84, Eqn. (29).

[Diagram VI. Resolution of the Frequency of T. rhodesiense into T. minus and T. majus. Observed histogram and calculated curves. Abscissa: microns.]

The resolution is not a very good one; for 24 groups χ² = 37.48, and P = .05, or
once in 20 trials only we should get a worse result. But an examination of either
the graph or the original frequency shows at once the cause of this divergence.
In their measurements Drs Stephens and Fantham have had a strange bias in favour
of even numbers. No curve whatever could fit the data satisfactorily under the
circumstances! Either they used a scale graduated to 2 microns only, and had a
prejudice in favour of the scale markings, or else their even numbers were in some
way more conspicuous than their odd. Whatever the source of this peculiarity
may be, there can be no doubt of the bias*.

The only way to obtain a reasonable measure of the goodness of fit in Stephens
and Fantham's results for T. rhodesiense is to group from 10 to 12, 12 to 14 and so
on in comparing the observed and calculated frequencies. If this be done we find
χ² = 5.03 for 13 groups and P = .957, a splendid fit. The frequencies are as
follows:

Microns      Observed    Calculated
10-14           9.0         7.17
14-16          38.5        34.67
16-18          83.0        92.99
18-20         133.5       132.91
20-22         134.0       124.79
22-24         127.0       124.28
24-26         135.5       146.35
26-28         139.5       145.56
28-30         112.0       106.55
30-32          60.5        56.22
32-34          22.0        21.36
34-36           4.0         5.84
36-38           1.5         1.17
Totals       1000         999.85
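
The effect of the regrouping can be followed numerically. The sketch below is illustrative: `pool_pairs` is a hypothetical helper showing how adjacent one-micron classes are added together so that a preference for even readings cancels within each group, and the unit-class counts fed to it are invented for the demonstration. The χ² is then recomputed from the grouped frequencies printed above; the observed entry for 30-32 is taken as 60.5, an assumption made so that the observed column sums to the stated total of 1000.

```python
def pool_pairs(frequencies):
    """Add adjacent classes two at a time (an odd final class stays alone)."""
    return [sum(frequencies[i:i + 2]) for i in range(0, len(frequencies), 2)]


def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over the groups."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical unit-class counts with an excess at even lengths
# (classes 15, 16, 17, 18, 19, 20): pooling smooths the zig-zag.
hypothetical_units = [10, 30, 18, 44, 30, 60]
print(pool_pairs(hypothetical_units))

# Grouped T. rhodesiense frequencies as printed (10-14, 14-16, ..., 36-38).
observed = [9.0, 38.5, 83.0, 133.5, 134.0, 127.0, 135.5,
            139.5, 112.0, 60.5, 22.0, 4.0, 1.5]  # 60.5 inferred from the total
calculated = [7.17, 34.67, 92.99, 132.91, 124.79, 124.28, 146.35,
              145.56, 106.55, 56.22, 21.36, 5.84, 1.17]

chi2_grouped = chi_square(observed, calculated)
print(round(chi2_grouped, 2))
```

The result lies within rounding of the χ² = 5.03 given in the text for 13 groups.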


(ii) T. brucei. The data for this trypanosome were taken from Sir David
Bruce and colleagues' diagram†. I have not come across the original publication
with the measurements involved in this diagram. Describing this species in
July 1910‡, the authors speak of its well-marked dimorphism. This is very
obvious in the graphs for length given for the Uganda 1909 and Zululand 1894
strains, but the numbers given are far too slender (160 and 200 respectively) to
justify any attempt at analytical resolution. Graphically we may take it that
roughly the following are the means of the components:

                  T. minus.      T. majus.
Uganda 1909       20 microns     28 microns
Zululand 1894     18 microns     29 microns.

These are not very widely divergent from the values
                  19.8 microns   26.0 microns
we have found from the seven resolutions.

In May 1911§ the two curves for Uganda and Zululand appear to be added
together to give a T. brucei curve of length distribution. This is again markedly
bimodal with one component mean at 18.75 microns and the other at 27.5 microns,
both approximative. Thus far T. brucei appears quite well to fit in with our other
material. But in September 1911 appears the diagram of T. brucei said to be
based on 1000 individuals. Here there is a mode about 24.0, with possibly a
sub-mode at 19 microns, but the evidence for dimorphism has largely disappeared.
It is very desirable that we should know the details of this curve, i.e. the nature
of the hosts and so forth, for it apparently replaces the earlier data and remains
the standard T. brucei distribution. It certainly shows nothing of the definite
heterogeneity (or dimorphism) of the previous Uganda material.

* Bias of this or of a similar character is not uncommon, even in the pages of this Journal.
I remember once pointing out to a Scotch anthropometer his prejudice in favour of whole
centimetres. He looked at his results, recognised the bias, and then gravely told me that it was not
due to any personal bias, but that the Creator must have designed Scotsmen on the metric scale!
† R. S. Proc. Vol. 84, B, p. 331.
‡ R. S. Proc. Vol. 83, B, p. 2.
§ R. S. Proc. Vol. 84, B, p. 186.


Its constants are as follows:
Mean = 23.5290,
μ₂ = 19.30583, μ₄ = 996.87764,
μ₃ = 10.54837, μ₅ = 2146.37930.
Reducing nonic:
24q⁹ − 101.8618q⁷ − 4.0057q⁶ + 140.6937q⁵ + 62.0835q⁴
− 29.3940q³ + [illegible]q² + [illegible]q + .0331 = 0.

No suitable root of this equation exists and accordingly it would appear that
this distribution is not rigidly reducible to Gaussian components. This result is
so remarkable in view of the obviously bi-modal character of the earlier T. brucei
distribution, and the resolution into two components of all the other seven
distributions, said to be allied to T. brucei, that I determined to consider the
matter further by fitting Gaussians to the 'tails' of the T. brucei distribution*.
I chose as the right-hand 'tail' the frequency from 28 to 38 inclusive, and as the
left-hand 'tail' the frequency from 13 to 18 microns inclusive. The two resulting
components were:

T. minus.                    T. majus.
m₁ = 20.0817 (19.83),        m₂ = 26.4359 (25.95),
σ₁ = 2.8685 (1.85),          σ₂ = 3.6399 (3.03),
n₁ = 628.16,                 n₂ = 467.52.

The total populations for each component are clearly not very good and their
combination exceeds by 9.6 % the total observed population; but the means are
not widely divergent from the average values resulting from our six resolutions, as
the numbers given in brackets testify. Accordingly I determined to select the
means of the components at values near the mean values of six reductions, and
after one or two slight betterments, determine the sizes of the populations and
their standard deviations so as to give the mean, and second and third moments of
the observed population. These provided:

T. minus.                    T. majus.
m₁ = 19.8244,                m₂ = 26.1122,
σ₁ = 2.6439,                 σ₂ = 3.4134,
n₁ = 410.83,                 n₂ = 589.17.

* Biometrika, Vol. II. p. 1 and Vol. VI. p. 65.


The following table gives the observed and calculated values:

Microns   Observed   Calculated      Microns   Observed   Calculated
13            5         3.44         26           82        72.74
14            8         5.80         27           72        67.98
15           14        12.25         28           50        59.71
16           17        22.79         29           38        48.10
17           40        37.05         30           27        36.04
18           63        52.87         31           26        24.79
19           55        66.68         32           18        15.67
20           66        75.44         33           11         9.09
21           63        78.43         34            4         4.84
22           75        77.49         35            4         2.37
23           87        75.61         36            -  }
24           93        74.71         37            -  }   [leading figure illegible].75
25           80        74.36         38            2  }

From these results we find χ² = 29.92 and P = .22. Thus more often than
once in five trials we should get a worse divergence than the observed, if the
sample were taken from the calculated population. Some endeavour was made to
better the fit by small variations from the above solution, discussed by least
squares, but no improvement was effected. The two components are represented
in Diagram VII (p. 132).
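
The adopted T. brucei solution can be checked against the constants of the observed distribution and against individual entries of the calculated column. The following is a sketch under stated assumptions: each class is taken as one micron wide and centred on a whole micron, the compound frequency is the sum of the two Gaussian components weighted by their populations, and the function names are illustrative.

```python
import math

# Adopted components for T. brucei: (population, mean, standard deviation).
T_MINUS = (410.83, 19.8244, 2.6439)
T_MAJUS = (589.17, 26.1122, 3.4134)
PARTS = [T_MINUS, T_MAJUS]


def normal_cdf(x, mean, sd):
    """Gaussian probability of a value below x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def compound_mean_and_variance(parts):
    """Mean and second moment about the mean of a mixture of Gaussians."""
    total = sum(n for n, _, _ in parts)
    mean = sum(n * m for n, m, _ in parts) / total
    variance = sum(n * (s * s + (m - mean) ** 2) for n, m, s in parts) / total
    return mean, variance


def compound_frequency(length, parts):
    """Calculated frequency in the one-micron class centred on `length`."""
    return sum(
        n * (normal_cdf(length + 0.5, m, s) - normal_cdf(length - 0.5, m, s))
        for n, m, s in parts
    )


mean, variance = compound_mean_and_variance(PARTS)
print(round(mean, 4), round(variance, 4))      # compare 23.5290 and 19.30583
print(round(compound_frequency(20, PARTS), 2))  # compare the entry for 20 microns
print(round(compound_frequency(25, PARTS), 2))  # compare the entry for 25 microns
```

The components reproduce the mean and second moment to which they were fitted, and the class frequencies agree with the calculated column to within rounding.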

(iii) T. gambiense.
Mean = 22.1130,
μ₂ = 14.3389, μ₄ = 531.3585,
μ₃ = 29.1104, μ₅ = 2429.0948.
Reducing nonic:
24q⁹ − 71.7810q⁷ − 30.5070q⁶ − 300.0260q⁵ + 869.6372q⁴
− 278.8475q³ − 270.9547q² + 58.9108q + 14.6050 = 0.
This leads to p₂ = − 10q = − 9.1777, and the components given in the Table
p. 125. The two Gaussians and their compound are given in Diagram VIII
(p. 133). We find χ² = 11.96, giving for n′ = 18, P = .80, a splendid fit.

[Diagram VII. Resolution of the Frequency of T. brucei into T. minus and T. majus. Ordinate: frequencies per micron; abscissa: microns.]

[Diagram VIII. Resolution of the Frequency of T. gambiense into T. minus and T. majus. Total 1000. Ordinate: frequencies per micron; abscissa: microns.]

(iv) Mzimba Strain (from Donkey).
Mean = 21.4130,
μ₂ = 8.7531, μ₄ = 293.5629,
μ₃ = 26.6602, μ₅ = 1926.7045.
The reducing nonic:
24q⁹ + 53.5186q⁷ − 25.5876q⁶ − 41.5706q⁵ − 171.2637q⁴
+ 227.1211q³ − [illegible]q² − 30.8995q + 8.6177 = 0.
The required root is p₂ = − 10q = − 4.0000, which leads to the two components
given in the Table on p. 125. The two components and their compound curve are
figured on Diagram IX. We have χ² = 19.28, giving for n′ = 17,
P = .26, a fairly reasonable fit.

[Diagram IX. Resolution of the Frequency of the Mzimba Strain into T. minus and T. majus. Abscissa: microns.]

(v) Wild G. morsitans Strain.
Mean = 22.6952,
μ₂ = 18.4918, μ₄ = 758.4420,
μ₃ = 43.0246, μ₅ = 3954.8788.


Reducing nonic:
24q⁹ − 224.6115q⁷ − 66.6402q⁶ − 595.9589q⁵ + 5079.3305q⁴
− 4500.7030q³ − 1460.5459q² + 879.6116q + 152.2340 = 0.

The required root is p₂ = − 10q = − 475085 [decimal point illegible], which leads
to the components given in Table on p. 125. These components with their compound
curve are drawn in Diagram X (p. 136). Here χ² = 92.75 which for 20 groups gives
P < .000,000,1. Thus although the G. morsitans strain breaks up into two
components the combined curve is not a probable description of the frequency. One
would like to test another sample of this strain; at present it tells against the
validity of our reduction.

(vi) Wild-Game Strain.
Mean = 22.6220,
μ₂ = 11.9310, μ₄ = 404.4932,
μ₃ = 29.0514, μ₅ = 2247.6657.
Reducing nonic:
24q⁹ − [illegible]q⁷ − 30.3834q⁶ − 250.2869q⁵ + [illegible]q⁴
+ [illegible]q³ − 212.3972q² + 15.4222q + 14.4283 = 0.

The root required is p₂ = − 10q = − 6.9859. There result the two components
provided in the Table p. 125. The two components and their compound are
figured on Diagram XI (p. 137). We find χ² = 12.61 giving for n′ = 19, P = .81,
an excellent fit.

(vii) Human Strain.
Mean = 23.7963,
μ₂ = 17.0252, μ₄ = 713.1660,
μ₃ = 27.1389, μ₅ = 3803.41222.
Reducing nonic:
24q⁹ − 131.3796q⁷ − 26.5147q⁶ − 89.8059q⁵ + 964.4176q⁴
− 674.2755q³ − 114.7894q² + 81.4492q + 9.5887 = 0.

The root is given by p₂ = − 10q = − 8.5576, which leads to the components
given in the Table on p. 125. The two curves and their compound are figured
in Diagram XII (p. 138). Although the two components merely from the
graphical point of view do not give a bad fit, the number of trypanosomes
involved is so large that the deviations are not reconcilable with random sampling
from two such components. We find χ² = 79.67, giving P < .000,001.

[Diagram X. Resolution of the Frequency of the G. morsitans Strain into T. minus and T. majus. Ordinate: frequency per micron; abscissa: microns.]


In order to determine how far heterogeneity of treatment or material might be
responsible we took further frequencies. In the first place we dealt with the 3600
measurements for trypanosomes through the rat only. The frequencies are:

[Frequency table, lengths 15 to 38 microns, total 3600. The pairing of classes and frequencies is not recoverable from the source. The frequencies that remain legible, in the order in which they survive, are 112, 161, 216, 290, 316; 376, 362, 322, 294, 235, 219, [illegible], 134, 108, 88, 57.]


[Diagram XI. Resolution of the Frequency of the Wild-Game Strain into T. minus and T. majus. Total 2500. Ordinate: frequencies per micron; abscissa: microns.]

[Diagram XII. Resolution of the Frequency of the Human Strain into two components; caption illegible in the source. Abscissa: microns.]


These give:
Mean = 24.6175,
μ₂ = 15.25897, μ₄ = 602.23008,
μ₃ = 19.21542, μ₅ = 2023.21556,
leading to the reducing nonic:
24q⁹ − 80.8739q⁷ − 13.2924q⁶ − 42.3159q⁵ + 306.5227q⁴
− 166.4257q³ − 24.8654q² + 12.6008q + 1.2081 = 0
which gives p₂ = − 10q = − 7.0031.
This provides the two components:

T. minus.              T. majus.
m₁ = 21.6772,          m₂ = 26.9993,
σ₁ = 2.2404,           σ₂ = 3.2981,
n₁ = 1611.18,          n₂ = 1988.82.

The components and their compound are figured in Diagram XIII, p. 140,
and we find for n′ = 21, χ² = 52.68 and P = .00016. There has thus been much
improvement of goodness of fit, although the result is still unsatisfactory.


It is impossible, however, to look through the graphs given by Sir David Bruce
and others for the human strains* without being convinced of their fundamentally
bimodal character, although there appears to be much evidence of its being
disguised by heterogeneity of host and treatment.

(viii) Diagram XIV (p. 141) gives the resolution for the human strain from
Chituluka†. The constants are
Mean = 26.172,
μ₂ = 23.52260, μ₄ = 1179.30786,
μ₃ = − 37.13226, μ₅ = − 3248.43805,
leading to the reducing nonic:
24q⁹ − 393.8678q⁷ − 49.6370q⁶ + 520.2910q⁵ + 8226.9435q⁴
− [illegible]q³ − 101.1017q² + 855.7520q + 63.2383 = 0.

The value of the root is p₂ = − 10q = − 16.2295 and this leads to the
components given in the Table p. 125, and illustrated in the diagram. The graph
while giving broadly some of the features of the case is by no means a satisfactory
fit; for n′ = 21 groups, χ² = 86 and P is < .000,000,1. The diagram suggests that
we are probably dealing with a mixture of three components with means about
18.5, 25.5 and 31.0, but at present we have no satisfactory method of performing
multiple resolutions of this character.

* R. S. Proc. Vol. 86, B, pp. 285-302.
† R. S. Proc. Vol. 86, B, p. 291.


[Diagram XIII. Resolution of the Frequency of the Human Strain, through Rat only, into T. minus and T. majus. Ordinate: frequency per micron; abscissa: microns.]

[Diagram XIV. Resolution of the Frequency of the Human Strain from Chituluka; caption illegible in the source. Abscissa: microns.]


It will be seen that the following strains, T. rhodesiense, T. brucei, T. gambiense,
the Mzimba, and wild game, give either reasonable or excellent results as combined
frequencies of T. minus and T. majus. On the other hand the G. morsitans and
the human strains break up into reasonable pairs of components, but the goodness
of fit test is not fulfilled. In the case of the human strain, we better matters
somewhat by taking the strain through the rat only, but the fit is still bad. If we
confine our attention to a single human being, the case of Chituluka, we still do
not get a satisfactory fit, although few statisticians could look at the four diagrams
published by Sir David Bruce and others for Chituluka*, and not recognise the
character of the material as being at least bimodal. The same applies to the
Mkanyanga data of an earlier paper†; it is distinctly bimodal. But besides this
bimodal character there are certain other features in the human data, and to a
lesser extent in the G. morsitans, which appear to some extent to disguise the
bimodal features. I am not prepared to assert definitely that this is the appearance
of a third component. It is of course easy to improve the fit of the distribution
by the introduction of such a third component, but the remarkable excellence
of a bimodal resolution for T. rhodesiense, T. gambiense, and the wild-game strain
makes me hesitate at present to adopt such an expedient.

Owing to the courtesy of Sir David Bruce (who heard from Sir John Rose
Bradford that I was much puzzled over the differentiation of strains) I have been able
to examine a series of drawings of the various strains of trypanosomes. There is
no other morphological differentiation which impresses itself a priori on the layman
and statistician, and which might serve as a new measure of the possibility of
differentiation into T. minus and T. majus. But it occurs to me that an index of breadth to
length of the nucleus might just possibly serve as a differential character of even
more importance than the length. It is only a suggestion and considerable caution
would have to be used in selecting only nuclei not near the dividing stage. But
it would be of striking interest to see how far the resulting frequency distributions
for the nuclear indices were or were not bimodal. I think a classification according
to nuclear index might possibly, to judge from the drawings, cut across the
forms "intermediate" in length. But this is only a suggestion which may appear
idle to the student of the subject‡. Some difficulty might also arise from the
doubt as to whether the index was really greater than 100, or the nucleus as
a whole had set itself athwart the "length" of the trypanosome. This difficulty
would certainly have to be considered in the "stumpy" T. brucei and T. gambiense
forms, but I am inclined to think that the index really passes through the value
100. Undoubtedly this range of index, or possible athwartness of the nucleus, is
not conspicuous in the simple strains like T. pecorum, T. simiae and T. caprae.

* R. S. Proc. Vol. 86, B, pp. 291 to 293.
† R. S. Proc. Vol. 85, B, p. 428.
‡ Several students of the subject with whom I discussed the matter stated that they considered the
nucleus so mobile and so impermanent in form, that a "nuclear index" would prove of little value.
I think much objection could a priori be raised to the use of the trypanosome "length" on the same
grounds. The problem is rather, whether in dealing with large numbers we do reach an average type.
It would only be possible a posteriori to justify the use of a nuclear index, i.e. if it were found to differ
sensibly from one pure strain to a second, and if it confirmed in such cases as T. rhodesiense resolutions
based on length frequencies.


Conclusions. (i) If appeal be made to statistical measurements, judgment 
between identity and diversity of strain must be formed by means of accepted 
statistical processes and not by mere comparison of graphs. 


(ii) Statistical processes show that the conclusions already formed as to the 
identity of trypanosome strains from mere inspection of the graphs cannot be 
confirmed. 


(iii) There must be some standardised process of treatment both in regard to 
host, and to method of and stage of infectivity at extraction. 


(iv) Even making allowance for differences due to host and treatment, we 
find remarkable divergences in the very strains asserted to be identical. 


(v) It would appear that some order would be brought into the chaos, if we 
could consider the strains described as 7. brucei, T. rhodesiense, T. gambiense, the 
wild-game, the Mzimba, and very probably the tsetse fly and the human strains 
as really consisting of two components, which for the time I have termed 
T. minus and T.majus. It is highly desirable that additional measurements should 
be made (? a nuclear index ascertained) to determine whether these lead also 
to similar components. 


I do not assume that this is a final solution of the problem, nor do I assert that 
T. minus and T. majus represent necessarily, although probably, distinct strains; 
they may be dimorphic forms of one and the same strain occurring in different 
proportions. But I believe that the suggestion of their existence may help to explain 
some anomalies of the present chaos. I ought also to state quite frankly that this 
paper is not written in a merely critical spirit. I believe that the trypanosome 
workers have undertaken in their elaborate systems of measurements most laborious 
and most valuable work, but I think the time has now come when, without 
trained statistical aid, but little further progress will be made in a very important 
and urgent matter. 


The very large amount of arithmetical work in this paper would never have 
got carried through had I not had the ever ready assistance of my colleague 
Miss Julia Bell; to Mr H. E. Soper also I owe help in the arithmetical work, but 
I have to thank him in particular for the careful preparation of the diagrams, and 
the planimetric determination of their frequencies by aid of which the χ² for all 
but two of the compound curves was found. In the case of T. brucei and 
T. rhodesiense actual calculation of the areas of the normal curves was used. 




ON HOMOTYPOSIS AND ALLIED CHARACTERS 
IN EGGS OF THE COMMON TERN 


By WILLIAM ROWAN, K. M. PARKER, B.Sc., and JULIA BELL, M.A. 


(1) Origin of the material and method of measurement. 

The settlement of Common Terns, which provided material for the present 
work, is one of old establishment on Blakeney Point, Norfolk. This is a shingle 
spit of some 8 miles in length on the north coast of Norfolk, about 12 miles 
west of Cromer. The colony is situated on the very end of the point, with 
water on three sides. Here the spit is a combination of dunes, salt marsh and 
shingle, and for the most part the nests are found on the open shingle on the 
seaward side of the dunes. Nests are plentiful in the embryo dunes in some 
years, though this year (1913) none were found there. The colony was more 
scattered than usual and covered the greater part of a mile of sea front. To 
avoid missing any clutches, Miss K. M. Parker, B.Sc., and Mr William Rowan 
divided the nesting area into suitable well marked plots and worked these one 
after another. Each of these again was worked in strips, till a patch was completed, 
when the workers moved on to a remote one, to give the birds a chance of 
settling down again. After measurement each egg was numbered with indelible 
ink, so that any one egg was never measured twice. In all 203 clutches were 
handled. 

(2) Reduction of the material. 


The principal part of the work of tabling and reduction was carried out by 
Julia Bell*. The characters dealt with were: 


(i) Length of Egg: L 
(ii) Breadth of Egg, maximum value: B 
(iii) Lateral Girth at section with maximum breadth: $G_b$ 
(iv) Longitudinal Girth: $G_l$ 
(v) Length-Breadth Index: B/L 
(vi) Mottling, as determined from a scale of selected eggs: M 
(vii) Ground Colour, as determined from a tint scale: C 


* The authors have to thank Miss B. M. Cave for certain tables and their correlation coefficients. 
The Editor is responsible for the actual wording of this paper. 




The Length of egg L may be considered as the easiest character to determine 
and needs no further comment. 


The Breadth of egg B should be closely related to the Lateral Girth $G_b$, and 
in most cases the relationship $G_b = \pi B$ is very closely satisfied. If we sum and 
take the means we have 

$$\pi = \text{Mean Lateral Girth} / \text{Mean Breadth}.$$

This gives in the present material: 

$$\pi = 3.224 \text{ as against } 3.142,$$

which marks an error of about 2.6 %, rather larger than we might anticipate, and 
possibly due to the inclusion of a certain number of slightly damaged eggs, and 
the measurement of the eggs in the field and not in the laboratory. The relation 
between $G_b$ and B is a useful test of accuracy and should be determined with a 
slide rule before the egg is finally replaced in the nest, or lost sight of. 
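The field check described above can be sketched in a few lines of Python. This is only an illustration: the function names and the 5 % tolerance are assumptions introduced here, not part of the original procedure, which was carried out with a slide rule.

```python
import math


def girth_to_breadth_ratio(lateral_girth_cm: float, breadth_cm: float) -> float:
    """Lateral girth divided by maximum breadth; equals pi for a circular section."""
    return lateral_girth_cm / breadth_cm


def percent_excess_over_pi(ratio: float) -> float:
    """Percentage by which an observed girth/breadth ratio exceeds pi."""
    return 100.0 * (ratio - math.pi) / math.pi


def looks_inconsistent(lateral_girth_cm: float, breadth_cm: float,
                       tolerance_percent: float = 5.0) -> bool:
    """Flag an egg whose ratio departs from pi by more than the tolerance.

    The default tolerance is a hypothetical value chosen for illustration.
    """
    ratio = girth_to_breadth_ratio(lateral_girth_cm, breadth_cm)
    return abs(percent_excess_over_pi(ratio)) > tolerance_percent
```

Applied to the ratio of means quoted in the text, `percent_excess_over_pi(3.224)` reproduces the error of about 2.6 %.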


The Longitudinal Girth $G_l$ is somewhat more difficult to measure, and a rough 
test of its accuracy not so easy to determine as in the case of $G_b$. We have, 
however, developed a formula for determining $G_l$ in terms of B and L, and on testing 
it we find that as a rule the differences are below 1.5 mm. Such a formula may 
be useful as emphasising the need for remeasurements, when the observed and 
calculated girths differ by much in excess of 1.5 mm. We are not prepared 
to say, however, that the coefficients in this formula can be extended beyond 
the case of the Common Tern. 


While the Length-Breadth Index is valuable as giving a measure of the 
ellipticity of the egg, it is not of much influence on the apparent oval shape, 
unless we suppose some theoretical geometrical construction for the egg. If we 
suppose the blunt end of the egg to be approximately spherical, the hemisphere 
ending with the maximum breadth, then the egg might be considered as divided 
into two portions, the upper or hemispherical with radius ½B and the lower with 
length from the base of the hemisphere (or ‘equator’) to the lower pole = L − ½B. 
The ratio of these two segments of the length depends only on the index B/L. 
Thus it is conceivable that this index has actually as much association with 
ovality as with ellipticity, although without some geometric theory of egg-shape, 
we are not able to make any dogmatic assertion as to the value of B/L. It 
seems, however, a character of considerable interest as being free of absolute size 
and also some measure of shape. If I = B/L and O be the ratio of ½B to L − ½B, 
i.e. O = B/(2L − B) = I/(2 − I), we may consider O a measure of the ovality, and we 
have correlated O for eggs of the same clutch as well as I. Of course, since O is 
a function of I, there will be relatively little difference in the results. 
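The relation between the index and the ovality can be checked numerically. The sketch below assumes only the geometric construction just described (a hemispherical blunt end of radius half the breadth); the function names are introduced here for illustration.

```python
def shape_index(breadth: float, length: float) -> float:
    """Length-breadth index I = B/L, as a fraction."""
    return breadth / length


def ovality_from_dimensions(breadth: float, length: float) -> float:
    """O = (B/2) / (L - B/2): hemisphere radius over the remaining length."""
    return (0.5 * breadth) / (length - 0.5 * breadth)


def ovality_from_index(index: float) -> float:
    """The same quantity written as a function of the index alone: O = I / (2 - I)."""
    return index / (2.0 - index)
```

Both routes give the same value for any egg, which is why correlating O adds relatively little to correlating the index itself. The inverse 1/O equals 2L/B − 1.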



The mottling is a far more difficult matter for determination. The points 
which may be considered are: 


(i) Size and shape of individual splodges. 
(ii) Portion of the egg over which these splodges are distributed. 
(iii) Area of mottled surface as compared with whole area of the egg. 


The fieldworkers selected 9 typical mottlings (see Plate IX) and named these 
a, b, c, d, e, f, g, h, i; they then compared each recorded egg with these and selected 
the letter which marked the egg on the scale most resembling the egg to be 
recorded. There is little doubt that in this manner they divided the whole series 
of eggs into differentiated classes. But it may be doubted whether the judgment 
made depended on one only of the above three characteristics. Hence when we 
came to arrange the eggs a, b, c, d,...h, i on a scale of mottling, we found that the 
order would not be the same when we classified in turn by each of the three 
characteristics. We endeavoured to place the eggs in order by extent of mottling, 
i.e. by (iii), but we think that the relatively low value of the homotyposis which 
has resulted is possibly due to size and shape of the mottlings, (i), having had 
as much influence on the classification as the extent of area mottling. Even 
position on the egg, (ii), can influence judgment considerably. We believe that 
in future work on eggs, it would be desirable to classify the mottling of each 
by using the three characteristics independently. Even then an ocular appre- 
ciation, as this must be, may fail to give a very close measure of the nature of the 
mottling and thus weaken any homotypic correlation. 


The Ground Colour of these eggs varies through all shades of brown to 
brownish greens and blue-greens. The fieldworkers attempted to give the value or 
depth of ground-colour pigmentation without regard to the brown or green shade 
of colouring. The scale of values is given at the foot of Plate VIII. 


A point seemed worth consideration: assuming the pigments to be deposited 
on the egg in its passage through the oviduct, it was conceivable that greater 
pressure might indicate greater intensity of pigmentation. We accordingly 
selected the broader egg in each clutch and investigated for every pair of eggs 
from the same clutch whether the broader or narrower egg had the larger mass 
of mottling and greater density of ground colour. We reached the following 
results : 


The broader egg in every possible clutch-pair has 


Greater mottling in 26 cases | More dense ground colour in 25 cases 
The same _,, The same 39. 


Perhaps not very much stress is to be laid on these results, but they suggest 
that the total amount of pigment deposited is less the broader the egg, i.e. for the 
same bird a relatively smaller egg will be more pigmented. A solution of this 


Plate VIII (Biometrika, Vol. X, Part I). Sample eggs, Common Tern, natural size, with colour value scale. 

Plate IX (Biometrika, Vol. X, Part I). Types of mottling of eggs of Common Tern. 


rather unexpected result may, perhaps, be found in the suggestion that the total 
amount of pigment is the same in both eggs, but the mottling and ground colour 
will appear denser on the smaller surface of the smaller egg. The point deserves 
consideration on the basis of larger numbers and possibly better defined measures 
of pigmentation. 


(3) Means and Variability. 

Table I gives the means, standard deviations and coefficients of variation of 
the several characters studied. It will be seen that the tern’s egg has for 
quantitative characters relatively small variation. The values of the coefficients 


TABLE I. 

Means and Variabilities (Absolute Measurements in Centimetres). 

Character                 Mean              Standard Deviation    Coefficient of Variation 
Length L                   4.14 ± .007       .180 ±                4.34 ± .12 
Breadth B                  2.98 ± .004       .099 ± .010           3.33 ± 
Girth $G_l$               11.39 ± .015       .376 ± .010           3.30 ± .09 
Girth $G_b$                9.59 ±            .347 ± .010           3.62 ± .10 
Index B/L                 72.04 ± .136      3.449 ± .096 
Index of Ovality, O*      56.35 ± .171      4.334 ± .121           7.69 ± 


of variation are less than many of those which we find for the human skull 
(3 to 8), but greater than those we know for the wing of the wasp. It is very 
doubtful whether the coefficients of variation of the indices should be included 
in such considerations, for the object of the use of these coefficients is to get 
rid of absolute lengths, and this is already done in the case of indices†. It is 
noteworthy that the length of the egg is only slightly more variable than the 
breadth and the breadth-girth is actually more variable than the length-girth. 
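The coefficient of variation used throughout this section is the standard deviation expressed as a percentage of the mean. A minimal sketch (the function name is introduced here) reproduces entries of Table I and the figures quoted in the footnote on 1/O:

```python
def coefficient_of_variation(mean: float, standard_deviation: float) -> float:
    """Standard deviation as a percentage of the mean."""
    return 100.0 * standard_deviation / mean


# Values as read from Table I and from the footnote on the inverse index 1/O.
cv_length = coefficient_of_variation(4.14, 0.180)        # egg length, cm
cv_ovality = coefficient_of_variation(56.35, 4.334)      # index of ovality O
cv_inverse = coefficient_of_variation(176.32, 11.24)     # inverse index 1/O
```

The last two lines illustrate the point of the footnote: O and 1/O describe the same shape, yet their coefficients of variation differ, so the coefficient cannot decide which is "the more variable".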


(4) Correlations. 


If we turn to the correlation of characters in the same egg, we note that 
while the ordinary product-moment correlation 7 has been calculated for all 
measurable pairs of characters, this is not possible for the ground colour or the 
mottling. Where mottling has been used with a quantitative character there η 
has been calculated and both corrections used. Where mottling has been considered 
in conjunction with ground colour, there we have adopted mean square 
contingency correcting for both number of cells and for class-index correlations. 


* O = (B/L)/{2 − (B/L)}. 
† For example, if we take 1/O for our index of ovality its mean = 176.32, the standard deviation 
= 11.24 and the coefficient of variation = 6.38. Is O or 1/O the more variable? It does not seem that 
the coefficient of variation can help us in such a problem. 




Certain facts are at once obvious from this Table, others are obscured. In 
the first place length and breadth of the egg of the Common Tern have a rela- 
tively small relationship, while the relationship between the two girths is between 

TABLE II. 

Correlations of Characters in the same Egg. 

Characters                            Symbols          Correlation              Remarks 
Length and Breadth                    L, B             + .2220 
Longitudinal and Equatorial Girths    $G_l$, $G_b$     + .5297 ± .0284 
Length and Longitudinal Girth         L, $G_l$         + .8804 ± .0088 
Breadth and Longitudinal Girth        B, $G_l$         + .5216 ± .0286 
Index and Longitudinal Girth          B/L, $G_l$       − .3832 ± 
Index and Length                      B/L, L           − .7284 ± .0185 
Index and Breadth                     B/L, B           + .5033 ± .0294 

Mottling and Ground Colour            M, C             + .2260 (corrected C)    More mottling, deeper ground colour 
Mottling and Index                    M, B/L           − .1550 (corrected η)    Less mottling, higher index 
Mottling and Breadth                  M, B             − .1803 (corrected η)    Less mottling, greater breadth 
Ground Colour and Index               C, B/L             .0000 (corrected η)    No relationship 
Ground Colour and Breadth             C, B             − .1506 (corrected η)    Fainter ground colour, greater breadth 


two and three times as great. This probably flows from the consideration that 
the correlation of $G_l$ and $G_b$ arises from B being a factor in both and only 
secondarily from the correlation between L and B. The correlation of the 
longitudinal girth with egg length is 60% higher than that of longitudinal 
girth with egg breadth; both these correlations are more substantial than that 
of the longitudinal girth, $G_l$, with the egg index, B/L. The egg index correlated 
with length is large and negative, and with breadth considerable and positive, 
precisely the results we should anticipate would appear if the correlation were 
largely spurious*. 

In order to ascertain how far it was possible to predict the longitudinal girth 
from length and breadth, double (for L and B) and triple (for L, B and B/L) 
regression formulae were worked out. The following equations resulted, where 
I = B/L and a bar denotes a mean: 

(i) $$G_l - \bar{G}_l = 1.2701\,(B - \bar{B}) + 1.6415\,(L - \bar{L}),$$

or, $$G_l = 1.2701\,B + 1.6415\,L + .8224,$$

and (ii) $$G_l - \bar{G}_l = -17.2930\,(B - \bar{B}) + 14.6374\,(L - \bar{L}) + .7636\,(I - \bar{I}),$$

or, $$G_l = -17.2930\,B + 14.6374\,L + .7636\,B/L - 52.7239.$$


The first seventeen eggs were taken as a random set to test these results upon 
with the following values: 


* As a matter of fact the correlation of index and length for a constant breadth is −.997 and 
of index and breadth for a constant length is +.996 instead of unity. These values indicate how closely 
the linearity of regression holds in these quantitative measurements. 
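The two values in this footnote follow from the usual first-order partial correlation formula applied to the coefficients of Table II. The sketch below assumes that this is how they were obtained and takes the correlation of length and breadth, .2220, from the organic correlation listed in Table V; the function name is introduced here.

```python
import math


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation of x and y with z held constant (first-order partial)."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))


# Total correlations as read from Tables II and V.
r_index_length = -0.7284
r_index_breadth = 0.5033
r_length_breadth = 0.2220

# Index and length for constant breadth.
index_length_given_breadth = partial_correlation(
    r_index_length, r_index_breadth, r_length_breadth)

# Index and breadth for constant length.
index_breadth_given_length = partial_correlation(
    r_index_breadth, r_index_length, r_length_breadth)
```

With the four-figure inputs above the two results agree with the footnote to about three decimal places.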




TABLE III. 
Observed and Calculated Longitudinal Girths. 

Egg        Observed    Calculated Girth       Difference 
Number     Girth 
1          11.40       11.14     11.20        + .26     + .20 
2          11.65       11.83     11.74        − .18     − .09 
3          12.10       12.24     12.07        − .14     + .03 
4          10.80       11.46     10.84        − .66     − .04 
5          11.70       11.23     11.31        + .47     + .39 
6          11.20       11.27     11.34        − .07     − .14 
7          12.15       13.19     12.31        −1.04     − .16 
8 (i)      11.20       11.19     11.97        + .01 
8 (ii)     11.30       11.09     11.27        + .21     + .03 
9 (i)      11.50       11.44     11.61        + .06     − .11 
9 (ii)     11.40       11.36     11.45        + .04     − .05 
10         11.50       11.52     11.61        − .02     − .11 
11         11.80       11.55     11.72        + .25     + .08 
12         11.90       11.62     11.74        + .28     + .16 
13 (i)     11.10       11.01     10.94        + .09     + .16 
13 (ii)    10.80       10.75     10.78        + .05     + .02 
18         11.70       11.45     11.55        + .25     + .15 
Root mean square 


To judge by this small sample we obtain only increased inaccuracy by taking 
the more complicated formula. We shall only make an error of about 1.4 mm. if 
we calculate the longitudinal girth from 

$$G_l = 1.2701\,B + 1.6415\,L + .8224,$$


and for the egg of the Common Tern at least this is a convenient formula for 
verifying measurements in the field. 
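For use as a field check, the two regression formulae can be written as small functions. The coefficients are those printed above; the function names, the remeasurement helper and its 0.15 cm threshold (the 1.5 mm mentioned earlier) are illustrative assumptions. The triple formula is taken here with the index on the percentage scale, 100 B/L, an assumption made because only that scale returns girths of the right magnitude.

```python
def girth_double_formula(length_cm: float, breadth_cm: float) -> float:
    """Longitudinal girth from length and breadth, formula (i)."""
    return 1.2701 * breadth_cm + 1.6415 * length_cm + 0.8224


def girth_triple_formula(length_cm: float, breadth_cm: float) -> float:
    """Longitudinal girth from length, breadth and index, formula (ii).

    Assumes the index enters as a percentage, 100 * B / L.
    """
    index_percent = 100.0 * breadth_cm / length_cm
    return (-17.2930 * breadth_cm + 14.6374 * length_cm
            + 0.7636 * index_percent - 52.7239)


def needs_remeasurement(observed_girth_cm: float, length_cm: float,
                        breadth_cm: float, threshold_cm: float = 0.15) -> bool:
    """True when observed and calculated girths differ by more than the threshold."""
    calculated = girth_double_formula(length_cm, breadth_cm)
    return abs(observed_girth_cm - calculated) > threshold_cm
```

At the mean length and breadth of Table I (4.14 and 2.98 cm) the simpler formula returns a girth close to the tabled mean of 11.39 cm.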


The remaining correlations indicate sensible correlations, but these correlations 
might well be substantially higher had a better scale of mottling been adopted 
ab initio. In the first place we see that the mottling and the ground colour 
are sensibly correlated, and the deeper the ground colour the more intense is 
the mottling*. 


We have already seen (p. 146) that for eggs of the same clutch the broader 
has less intensity of ground colour and more meagre mottling. This is true 
for the eggs of the Common Tern in general, although it is probable that a better 
classification of mottling would bring out more marked correlations. The 


* This might probably be asserted interracially as well as intraracially, compare for example the 
swallow with the skylark, the lapwing with the ringed plover, etc. 




following are the orders (a) of mottling chosen, (b) of breadth classes, (c) of 
index classes: 

(a)                   (b)                       (c) 
Order of Mottling     Order of Breadth          Order of Index 
Class                 Class       Mean          Class       Mean 
g+e+d                 a           3.00          a           72.64 
a                     c           2.99          f+i         72.54 
b                     g+e+d       2.97          c           72.30 
c                     f+i         2.96          g+e+d       72.27 
h                     h           2.96          h           71.95 
f+i                   b           2.95          b           70.54 

Mean                                                        72.30 


The relationship is small, but exists. It seems reasonable to suppose that 
the order of mottling classes as given by B or B/L, where there is only one 
displacement, may be a better one than that we have selected. But if in the 
mottling order b and c were interchanged, it would agree with the B classification, 
in so far that the three classes of least and of most mottling in the two classi- 
fications would be the same. 


We now turn to the ground colour. We see that the ground colour is 
fainter, when the egg has greater breadth, but that there is no relation of the 
index to the intensity of ground colour. The results of p. 147 are thus confirmed 
by the general correlation of ground colour and breadth. Although there is no 
high correlation, we may assert that it is probable that the intensity of pigment 
does not depend on the pressure during transit of the oviduct, but rather on 
a constant amount of pigment being distributed over a larger surface. 


(5) Homotyposis in Eggs of the same Clutch. 


The homotyposis, or degree of resemblance in character between eggs of the 
same clutch may be studied on the present material. The chief direct and cross 
homotypic correlations are given in Table IV. 
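A direct homotypic correlation of this kind can be sketched as the product-moment correlation over a symmetrical table in which every pair of eggs from the same clutch is entered twice, once in each order. The code below is an illustration under that assumption; the function name and the sample clutches are hypothetical, not data from the paper.

```python
from itertools import permutations


def homotypic_correlation(clutches):
    """Product-moment correlation over all ordered pairs of eggs within a clutch.

    Each clutch is a list of measurements of one character. Because every pair
    is entered in both orders, the two marginal distributions are identical and
    a single mean and variance serve for both.
    """
    first, second = [], []
    for clutch in clutches:
        for a, b in permutations(clutch, 2):
            first.append(a)
            second.append(b)
    n = len(first)
    mean = sum(first) / n
    variance = sum((x - mean) ** 2 for x in first) / n
    covariance = sum((x - mean) * (y - mean) for x, y in zip(first, second)) / n
    return covariance / variance


# Hypothetical breadths (cm) for four clutches of two or three eggs.
example = [[2.95, 2.97], [3.05, 3.02, 3.04], [2.90, 2.93], [3.00, 2.98]]
r = homotypic_correlation(example)
```

Clutches of a single egg contribute no pairs. A clutch of three contributes six ordered pairs.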


Pearson has shewn* that the degree of resemblance of undifferentiated ‘like 
organs’ might be expected to be equal to that of pairs of brethren, i.e. about .50, 
and proved that this is so for many homotypes in the vegetable kingdom, a result 
which has been since confirmed by much as yet unpublished material from the 
animal kingdom, including a number of series of birds’ eggs. Thus the mean 
value of the homotyposis for eggs of the Common Tern could hardly be improved 
upon. Only the colour characters show irregularity, especially the mottling, a 


* “On Homotyposis in the Vegetable Kingdom,” Phil. Trans. Vol. 197, A, pp. 285—379, 1900. 




feature we have already indicated as difficult to measure. It will be seen that 
the correlation of the ground colour of an egg with the mottling of a second 
(.3989) has come out greater than the organic correlation between mottling and 
ground colour in the same egg (.2260). 


TABLE IV. 

Homotypic Correlations. 

          Symbols         Characters                                                          Correlation 

Direct    L, L            Lengths of Eggs in same clutch                                      .4643 ± .0346 
          B, B            Breadths of Eggs in same clutch                                     .5176 ± .0326 
          $G_l$, $G_l$    Longitudinal Girths of Eggs in same clutch                          .5076 ± .0327 
          $G_b$, $G_b$    Equatorial Girths of Eggs in same clutch                            .4621 ± 

                          Mean value                                                          .4879 

          M, M            Mottling of Eggs in same clutch                                     .3500 
          C, C            Ground colour of Eggs in same clutch                                .5709 

                          Mean of six characters                                              .4788 

Cross     L, B            Length of one Egg with Breadth of a second                          .0922 ± .0441 
          C, M            Ground colour of one Egg with Mottling of a second                  .3989 
          L, $G_l$        Length of one Egg with Longitudinal Girth of a second               .4229 ± .0362 
          B, $G_l$        Breadth of one Egg with Longitudinal Girth of a second              .2530 ± .0416 
          $G_l$, $G_b$    Longitudinal Girth of one Egg with Equatorial Girth of a second     .2603 ± .0413 

Index     B/L, B/L        Indices of two Eggs of same clutch                                  .5537 ± .0308 
          O, O            Indices of ovality of two Eggs of same clutch                       .5527 ± .0309 
          1/O, 1/O        Inverse of indices of ovality                                       .5361 ± 

                          Mean of three Index Correlations                                    .5475 
                          Mean of nine Homotypic Correlations                                 .5017 


We feel that the classification by mottling is at present too uncertain, and 
that until the result cited has been confirmed with larger numbers and more 
definite categories, it would be idle to consider whether, while a given bird has 
usually highly or lowly pigmented eggs both as to ground colour and mottling, 
yet when in the individual egg there is an excess of mottling pigment, there 
may be some tendency to a relatively less increase of ground colour. Thus the 
correlation in the individual egg might possibly be less than the correlation 
between eggs of the same clutch. Such considerations must be postponed until 
the fact itself is adequately demonstrated. 




Another relation suggested by Pearson* is that the cross homotypic correlation 
of the characters x and y should on the average equal ½ (correlation of 
x and x + correlation of y and y) × (the organic correlation of x and y). It is 
clearly impossible from what has just been said to apply this to the cross 
homotyposis of ground colour and mottling. We can apply it to the five cases 
in which quantitative measurements have been made. Table V gives the 
requisite data, the last two columns giving respectively the calculated and 
observed cross correlations. 


TABLE V. 

Cross Homotypic Correlations. 

Characters           Direct Correlations          Organic Correlation    Cross Correlation 
(1)      (2)         (1) and (1)   (2) and (2)    (1) and (2)            Calculated    Observed 

L        B           .4643         .5176            .2220                  .1090         .0922 
L        $G_l$       .4643         .5076            .8804                  .4278         .4229 
$G_l$    $G_b$       .5076         .4621            .5297                  .2568         .2603 
B        $G_l$       .5176         .5076            .5216                                .2530 
$G_l$    B/L         .5076         .5537          − .3832                − .2033       − .2007 


When we compare the calculated and observed cross correlations, we see 
a striking agreement, or the theory that cross homotyposis is the product of 
direct homotyposis and the organic correlation of the characters under investi- 
gation holds very closely for the egg of the Common Tern. 
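The rule tested in Table V is simple enough to verify directly. The sketch below takes the direct and organic correlations as read from Tables II, IV and V; the function name is introduced here.

```python
def predicted_cross_homotyposis(r_xx: float, r_yy: float, r_xy: float) -> float:
    """Half the sum of the two direct homotypic correlations, times the organic correlation."""
    return 0.5 * (r_xx + r_yy) * r_xy


# (direct x, direct y, organic x-y, observed cross correlation)
cases = {
    "length, breadth": (0.4643, 0.5176, 0.2220, 0.0922),
    "length, longitudinal girth": (0.4643, 0.5076, 0.8804, 0.4229),
    "longitudinal girth, equatorial girth": (0.5076, 0.4621, 0.5297, 0.2603),
    "breadth, longitudinal girth": (0.5176, 0.5076, 0.5216, 0.2530),
    "longitudinal girth, index": (0.5076, 0.5537, -0.3832, -0.2007),
}

for name, (r_xx, r_yy, r_xy, observed) in cases.items():
    calculated = predicted_cross_homotyposis(r_xx, r_yy, r_xy)
    print(f"{name}: calculated {calculated:+.4f}, observed {observed:+.4f}")
```

In all five cases the calculated and observed values differ by less than .02, which is the agreement the text describes.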


The general results obtained are in good accord with those reached by previous 


observers, and the authors hope to investigate one or two doubtful points on 
fuller material this year. 


* Phil. Trans. Vol. 197, A, p. 290. 




APPENDIX OF CORRELATION TABLES 
TABLE A. Length and Breadth of Egg. 
Breadth. 


355-359 | —| 1 | — 
$65—369 | —|—| 1 | —| 1 
s75—879 —|—|—| 1] 1 5 
-- | — | — 1} 1 | 5 
385-389} 1 | —|—, —|—]|] 3] 1] 3] 2)--| — —| 1 
| 395-399] —|—|—|1]—] 8] 4] 6] 4) 4] 1); -|—] 
| 400-404} —| —| —| —] 2) 5] 6] — 26 
4 | —| —| —| 1] —] 13] 12} 8) 5] 2) 3] — 45 
—| —|—|—|—]| 8] 4] 9] 38] 2] 3] — 24 
—|—|—|—|—] 1! 1] 5] 5] 8] 6| 2} 1 31 
£25—5-29} —| —| 2] 4] 4] 4] 5] 23 
| 435—4'39 | — —|/—|;—; 1) 83] 4] 83] 14 | 
| | —| —| —| —| —| 0 
Totals 1/2) 1 2/4 | 14/24] 61 | 61 57 | 35 | 20 10) 2 294 | 
i 


TABLE B. Girth L, and Girth B. Girth B. 


| 1010-1019 } — | —| —|—|—|—|—|-| o | 
| 1030—10389 |} —| —|— 0 
| 10-40-1049} 1 |—|— —| 3 
| 1060—1069} —| —| — —|— | 1 
10-70—10°79 | — | —|— 1 4 
j | — | —| 8] 2] 8) 8) 13 
© | 11:10—-11:19}] —| —| —|—]| 2] 5| 9) 5) 
| 11-20-1129] — | -—|—|--| 2] 8| 31 
11°30—11'39 —| —|— | 1 3] 7] 
1140-1159 3] 2 | —|— 28 | 
| 2170-1179] —|—|— 1] 3] 2| 6! 4) 1 | —|— — 17 | 
| 1180-1189} —|— |— 1] 3] 2) 4) 5) — 17 
| 11:90--11:99}] — | —| —| — | 12 | 
| — | —| —| —| —|—|—|—| 3] 
| 120-1229 | — | — | — | 1}—|- 
Totals 5 44 | 84 | 64 | 42/18 | 7 1|4 21 | 
TABLE C. 

TABLE D. 
Breadth of Egg and Girth L. 
Breadth. 
& | & | & RQ i » | % 
Sic in Le R 
| 
1010-10-19 } — | — | —| —| —|—| 0 
10-20-1029} 1 | —| 1 | 2 
10-30—10°39 -- | —| —| —|— —|-|-|-— ~ 0 
10-4o—1049 —| 1 | 3 
| 10°50—10°59 | — | — | — 2 
10a0—1060 — | —|—|—i—| 5] 1| 3] 1] 8] —j|—|—J— 13 
nj | 1090-1099) —|—|—|1/—| 1] 4| 4] 3] 5} -—|—|—|— 18 | 
| 1100-1109] 1] 1] 2 —|— 16 | 
| 11-20-1129] —| —| —| —|—) 1] 7) 10] 6] 4] 2) 1] 32 | 
11-40-1149] 3| 13] 8] 6] 1;—|— 40 | 
11°50—11'59 |} —| —|—|—j—|—| 2 6| 4) 1) 28 
— | 4) 4] 8] 7] 4] 
|} — | —| —| —|—|—]| 1] 2}]-2] 1] 8] 2) 1 17 | 
1190—11':99 |] — | — | -|—|—;—]| 3] 2) 3] 1] 1) — 12 
12-00-1200} —|— | 2] 6 
12:10—12:19 —| —| —| —| —|—|—]| 2] 6 
1 
12°30—12:39 |} — | — | —| — |— | —|-—|-|- 
i] 
Totals 1 57 | 35 | 20| 10! 2 294 

TABLE E. Girth L and Index 100 Breadth/Length. 


TABLE F. Length of Egg and Index 100 Breadth/Length. 



TABLE G. 
Breadth of Egg and Index 100 Breadth/ Length. 


Breadth. 
\660—6791 1/1] 1) 4] 9] 8] 
% | 680—69-9] — — — 
|700—719] — | —| —| —|—]| 4! 22) 8] 
— 1] 1] 4 
| j | 
| Totals J 1 2 4 |14| | 61 | 57 | 35 20 | 10 2 | 294 
TABLE H. 
Ground Colour and Mottling. 
Ground Colour. 

Mottling    a+b        c          d          e          f          g-k        Totals 

g+e+d       12         15         4          16         11         15         73 
            (+1.21)    (+3.96)    (−5.28)    (+ .70)    (−4.55)    (+3.96) 
a           4          5          4          4          5          4          26 
            (+ .16)    (+1.07)    (+ .69)    (−1.45)    (− .54)    (+ .07) 
b           3          9          3          14         13         7          49 
            (−4.24)    (+1.59)    (−3.23)    (+3.73)    (+2.56)    (− .41) 
c           12         11         18         18         29         13         101 
            (−2.92)    (−4.27)    (+5.16)    (−3.17)    (+7.48)    (−2.27) 
h           5          0          2          3          0          3          13 
            (+3.08)    (−1.97)    (+ .35)    (+ .28)    (−2.77)    (+1.03) 
f+i         7          4          6          6          4          2          29 
            (+2.71)    (− .39)    (+2.31)    (− .08)    (−2.18)    (−2.39) 

Totals      43         44         37         61         62         44         291 


TABLE J. 
Index 100 Breadth/Length and Mottling. 

TABLE K. 
Breadth of Egg and Mottling. 


TABLE L. 
Ground Colour and Index 100 B/L. 
Ground Colour. 
i | & | Totals: 
66°0—66'9 | 2 2 | 3| 3 14 
680—689 1/ 2; 3) 2! 4) 1) 24 
700-709 | 1) 6| 5) 4) 7)10|—| 3 
770-719 | 2) 1} 5) 6] 4) 1, 37 
| 70-789 | 1| 2| 5] 6| 1 23 
140-749 2] 4! 7] 5) 26 
0-759 | 1) 1) 5) 5) 5) 3) —|—] 28 | 
760-769 1 1; 1! 1] 4] 2 13 
| 7o—779 1] 1] 2] — Pex 9 
| 790-799 | 1] 2 
| 80°0—80°9 | | — | — | — | — 3 
820-829 |— 1] —j|—|—|-|-|- 
| | | | 
Totals 12 | 32 | a4 37 | 61 | 61 | 19 is| 5 | 1] 290 | 
| 
TABLE M. 
Ground Colour and Breadth. 
Ground Colour. 
a | b | e | d | e | f | g\h | z | k | Totals | 
| 260—264 —|—|-|— Lp | 2 
270—2T 4 2 
3| 3] 8; 2, 2/- 26 
S| | 4/ 6/12) 7] 3) 6|—|—] 59 | 
| 2:95—2:99 31.8) 8 60 | 
| soo—soh 7/11! 10/10/10! 6) 1 
| gos—so9 | 3! 4| 2/10} 8| 
| 4| 3] 6] 3|/—|—) 1 20 
2] 1] 1} tir 10 | 
| Totals 12 32 44/37/61 | 62/20] 18] 5 | 1 292 


160 On Homotyposis in Eggs of the Common Tern 
TABLE N. 
Breadth of Egg in Pairs of same Clutch. 
SSIS SiS s | 
Ni Rl | 31H | 
1 | —| 1 
| 
2] «| 2} | 
| 3] 4] 4] 5] 1}/—|—|/—] 16 
| 2:90—2:945 —| —|—|1] 4] 4|10]14] 8] 5] 1)— 47 
| 295-299 —{|—|—|1] 3] 5| 14/16] 13} 5|—|—] 55 
| —|—|—]| 1] 1] 8| 8| 2] 52 
| 305—3-:09 —|—|—!—|—]| 5] 5] 8} 4] 3}; — 25 
310-8144 —|—|—| 3] 4|—] W 
| Totals | 1 | | 2] 12] 16] 47 | 55 | 52 | 25] 17 | 2 230 
TABLE O. 
Length of Egg in Pairs of same Clutch. 
' | Is S S S > > | S S 
 —| — | 2 
4| 2 1) 1) 1] 12 
— |— | 2] 3) 4] 4) 4) 2B] 1] 1] 
| —|—; 1); 1, 2) 1] 1] 2] 5] 2] 2] 2] 8 23 | 
| 254291 4] 1] 3) 1; 12 | 
| 1] 1] 4] 2! 4 | 
44-4491 
Totals | 1 | 2 | 12 9 | 24 | 28! 24 | 36 | 24/23! 12 | 20] 14] 1 1 | 3 | 234 


TABLE P. 

Girth B in Pairs of same Clutch. 
ISIS 

8:20— 8:39 | 2 FS 9 
9-00— 919 |—|—]| 4] 3] 6] 1] 2 
9:40— 9°59 1] 6/18!290/14| s8;—|—|—|1] 68 
10-40—10°59 | — | — | — 0 
| Totals 2 | 0 | 2|16| 68| 57/31/11] 3] 0 | 3] 220 
TABLE Q. 
: Girth L in Pairs of same Clutch. 
ais | 

i#10—1¢19] — | —| 4]-|-|-|-] 2 | 

Totals | 1/|0/|1 1fife|u 15 | 18 | 21 | 25 | 23 | 32 | 21 | 20 18 | 11 10, 1 | 2 | 234 

TABLE R. 
Mottling in Pairs of Eggs of same Clutch. 

            g+e+d      a          b          c          h          f+i        Totals 

g+e+d       26         4          12         19         4          3          68 
            (+5.54)    (− .81)    (+2.37)    (−5.97)    (− .21)    (− .91) 
a           4          4          1          7          0          0          16 
            (− .81)    (+2.87)    (−1.27)    (+1.12)    (− .99)    (− .92) 
b           12         1          8          9          1          1          32 
            (+2.37)    (−1.27)    (+3.47)    (−2.75)    (− .98)    (− .84) 
c           19         7          9          40         5          3          83 
            (−5.97)    (+1.12)    (−2.75)    (+9.52)    (− .14)    (−1.77) 
h           4          0          1          5          2          2          14 
            (− .21)    (− .99)    (− .98)    (− .14)    (+1.13)    (+1.19) 
f+i         3          0          1          3          2          4          13 
            (− .91)    (− .92)    (− .84)    (−1.77)    (+1.19)    (+3.25) 

Totals      68         16         32         83         14         13         226 
TABLE S. 
Ground Colour of one Egg with Mottling of the other Egg for 
Pairs of same Clutch. 
Ground Colour. 

Mottling    a+b        c          d          e          f          g-k        Totals 

g+e+d       8          11         13         8          16         11         67 
            (−3.67)    (+1.43)    (+3.73)    (−5.16)    (+ .75)    (+2.92) 
a           1          1          0          6          4          5          17 
            (−1.96)    (−1.43)    (−2.35)    (+2.66)    (+ .13)    (+2.95) 
b           2          5          4          3          12         5          31 
            (−3.40)    (+ .57)    (− .29)    (−3.09)    (+4.94)    (+1.26) 
c           17         10         10         24         15         6          82 
            (+2.72)    (−1.71)    (−1.35)    (+7.89)    (−3.67)    (−3.88) 
h           8          1          1          0          4          0          14 
            (+5.56)    (−1.00)    (− .94)    (−2.75)    (+ .81)    (−1.69) 
f+i         3          4          3          3          0          0          13 
            (+ .74)    (+2.14)    (+1.20)    (+ .45)    (−2.96)    (−1.57) 

Totals      39         32         31         44         51         27         224 


In Tables R—T, the contingency of each cell is given in brackets. 
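The bracketed contingency of a cell is the observed count less the count expected if rows and columns were independent (row total × column total ÷ grand total). The sketch below computes these and the uncorrected coefficient of mean square contingency. The function names are introduced here, and the corrections for number of cells and for class-index correlations mentioned in the text are not attempted.

```python
import math


def cell_contingencies(table):
    """Observed minus independence-expected count for every cell of a table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[obs - row_totals[i] * col_totals[j] / n
             for j, obs in enumerate(row)]
            for i, row in enumerate(table)]


def contingency_coefficient(table):
    """Uncorrected coefficient of mean square contingency, sqrt(phi2 / (1 + phi2))."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi_square += (obs - expected) ** 2 / expected
    phi2 = chi_square / n
    return math.sqrt(phi2 / (1.0 + phi2))


# Table T collapsed, for illustration only, to "a+b" against all other classes:
# 22 pairs with both eggs in a+b, marginal total 39, grand total 222.
collapsed = [[22, 17], [17, 166]]
contingencies = cell_contingencies(collapsed)
```

The first cell of the collapsed table has the same marginal totals as in Table T, so its contingency matches the value bracketed there, +15.15.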



Breadth of Second Egg of Pair. 


TABLE T. 
Ground Colour in Pairs of Eggs of same Clutch. 

            a+b         c          d          e           f          g-k        Totals 

a+b         22          7          5          1           3          1          39 
            (+15.15)    (+1.55)    (− .45)    (− 7.08)    (−5.96)    (−3.22) 
c           7           6          6          4           4          4          31 
            (+ 1.55)    (+1.67)    (+1.67)    (− 2.42)    (−3.12)    (+ .65) 
d           5           6          8          7           4          1          31 
            (−  .45)    (+1.67)    (+3.67)    (+  .58)    (−3.12)    (−2.35) 
e           1           4          7          20          12         2          46 
            (− 7.08)    (−2.42)    (+ .58)    (+10.47)    (+1.43)    (−2.97) 
f           3           4          4          12          20         8          51 
            (− 5.96)    (−3.12)    (−3.12)    (+ 1.43)    (+8.28)    (+2.49) 
g-k         1           4          1          2           8          8          24 
            (− 3.22)    (+ .65)    (−2.35)    (− 2.97)    (+2.49)    (+5.41) 

Totals      39          31         31         46          51         24         222 
TABLE U. 
Length of one Egg with Breadth of the other Egg for Pairs of same Clutch. 
Length of one Egg of Pair. 
% | | | & 
sis) si 3 3/3/72 
255—2°59 | — | — | — | —|— | _ 1 
1| 3} 3] 3] 16 
290—294)—| 1] 4] 1] 7; 3; 6] 7] 2] 5]-2] 5] 3} 1 47 
295-2991 1|—| 2] 8, 9] 6| 4] 3] 5] 1] 55 
2| 2] 5| 4] 5] 4] 1] 52 
$05—3:09 |} —-|—|—|—| 4] 6| 2] 4]-2] 3] 2}—] 2 
1] 2| 3} 3] 1] 1] 4/—] 
$15—819} —| —| —| - 2 
Totals | 1 | 2|12| 9| 24 | 28 24 | 36 | 22 | 93 | 12 18 | 1 1} 3 230 
TABLE X. 
Girth L. and Girth B. in Pairs of same Clutch. 
Breadth Girth of First Egg. 

cs) | Sis 

| 10°80—10°99 | — | — | — 26 
11°20—11:39 | —| —|—|—| 3] 8 4/—|—|1] 48 
11-40-1169 | |— |—| 2 | 3] 4) 14/17] 6| 
1100-11789 §— | — 3] 8] 71m] 812 | 38 
| 11°80—11:99 | —| —|—|—| 2] 3} 6 1)/—|—|—] a 

=] | 
Totals 2 0} 0 | 2 | 16| 37 | 68 | 59 232 
TABLE Y. 
Index 100 Breadth/Length in Pairs of same Clutch. 

62 | 63 | 64 | 65 | 66 | 67 | 68 | 69 | 70 | 71 | 72 | 7 | wh | 75 | 76 | 7 78 | 79 | 80 | Totals | 
69 15] 3 | 15] 15] —] -- | 48 —|— | 
1-;—{/—| 6] 15] —] 3 | 28) | — | — 
71 15! 3 | 7 75) 5 | 3 —|—|—|]—)| —] 22s | 
7s 5 5 3 | | 23 | 
—| —| — 2 | 15] 65] | 3| 3 22 

| 
Totals} 1 | 0 | 2 | 2°5 105 | 11°5 | 32°5 | 49°5 | 23° | 26 | 22 145 | 10 |2 | | 230 
TABLE Z.
[Title partly illegible: Index 100 B/L ... in Pairs of same Clutch. Body of table not recoverable; index classes run from 45 to 66 and the grand total is 230.]


TABLE AA. 


2 L/B—1 in Pairs of same Clutch. 


TABLE BB. 
Girth L and Index 100 B/L in Pairs of same Clutch. 
Length—Girth of First Egg. 


S i=) S S ™ RX 
S S ~ ~ RX 
| —| —| -]- 0 
bo | 640-649) 2] 3 
= | 65-0—65-9 1 
| 66-0—66'9 —| 7 
|670—679] —| —| —| —|— 3] 1] 2] 1] — 7 
— |—|—|—| 1| 1] 8] 9] 1] — 15 
— | —| 1] 1] 9 
—|—|—| 6| 3| 6] 7] 6| 7|—] 35 
—|—| 1] 4| 7] 6] 4] 31 
—|—|1| 2| 4| 2] 3] 3] —|— 15 
790-799} 1] 2 
Totals | 1 | 2 | 3 | 26| 48 | 51 | 38| 3 | 232 




MISCELLANEA. 


I. The Statistical Study of Dietaries, a reply to 
Professor Karl Pearson. 


By Professor D. NOEL PATON, F.R.S.


PROFESSOR PEARSON’S criticism of Miss Lindsay’s Study of the Diets of the Labouring 
Classes in the City of Glasgow (Biometrika, Vol. IX. Oct. 1913) is a good example of the
danger of one who does not understand the problems involved and who is ignorant of the 
work already done upon a subject attempting to discredit the results of an investigation by 
the application of mathematics according to his own fancy and in, what seems to me, a 
totally illegitimate manner. 


Not appreciating the questions which were under investigation, he starts his criticism by
demanding that our studies should afford a solution of problems other than those we had
before us, and, because he does not find the solution of these problems, he proceeds to abuse
the work.


Apparently in his opinion the object of the studies should have been to determine what 
effect the diets which the families were taking at the time of the study had upon the 
physique of the various individuals. He states that, if adequate anthropometric observations 
had been secured in such a study, it would have been at once possible to co-relate these 
with the diets. It is unnecessary to point out, as was pointed out in the Report, that the 
physique is determined by the whole previous condition of life and by the influence of 
heredity, and that it is absurd to attempt to relate it solely to the diet (Report, pp. 3 and 4). 


The objects of the studies are quite clearly stated on p. 4 of the Report: “Do the 
working classes of this city get such a diet as will enable them to develop into strong, 
healthy, energetic men, and, as men, will enable them to do a strenuous day’s work ; or are 
the conditions of the labouring classes such that a suitable diet is not obtainable? Further, 
if a suitable diet is obtainable, and is obtained, is it procured, or can it be procured, at a 
cost low enough to leave a margin sufficient to cover the other necessary expenses of the 
family life, with something over for those pleasures and amenities without which the very 
continuance of life is of doubtful value?” 


It was accepted as proved by previous work that for the labouring classes: “If a family 
diet......gives a yield of energy of less than 3500 Calories per man per day it is insufficient 
for active work, and if less than 3000 it is quite inadequate for the proper maintenance of 
growth and normal activity.” 
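
The two limits quoted above amount to a simple rule of classification. The sketch below states it in code for illustration only. The function name `classify_diet` and the wording of the three labels are assumptions; only the limits of 3500 and 3000 Calories per man per day come from the passage.

```python
def classify_diet(calories_per_man_per_day):
    """Classify a family diet by the two energy limits quoted from the Report."""
    if calories_per_man_per_day < 3000:
        # Below 3000: "quite inadequate for the proper maintenance of growth
        # and normal activity".
        return "quite inadequate"
    if calories_per_man_per_day < 3500:
        # Between 3000 and 3500: "insufficient for active work".
        return "insufficient for active work"
    # Label for this class is an assumption, not the Report's wording.
    return "meets the accepted standard"


for calories in (2690, 3215, 4003):
    print(calories, classify_diet(calories))
```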

The first question investigated was: “Did the families examined receive this supply of 
energy?” As regards the poorest classes this was answered in the negative. The validity of 
this conclusion has not been challenged by Professor Pearson. 



The second question considered was whether the diets contained a sufficient supply of 
protein. Previous work indicates that this is probably something above 110 grms. per man 
per diem. It was shown that in families with regular incomes of over 20s. a week the 
average protein intake was above 110 grms., and that in families with regular incomes and 
in those with irregular incomes of under 20s. a week the average protein intake was under 
110 grms. This conclusion has not been refuted. 


Accepting our premises, the final conclusion was (p. 27) “that while the labouring classes 
with a regular income of over 20s. a week generally manage to secure a diet approaching the 
proper standard for active life, those with a smaller income and those with an irregular 
income entirely fail to get a supply of food sufficient for the proper development and growth
of the body and for the maintenance of the capacity for active work.”


The main points proposed for the study were thus elucidated. 


The part of the Report to which Professor Pearson specially directs his criticism is not 
the main problem, but that dealt with on pp. 30 and 31—The Physique of Children in 
Relationship to Diet, a subject taken up at the suggestion of Dr Chalmers. Professor Pearson, 
having declared the data totally insufficient, proceeds to apply his statistical methods not to 
refute Miss Lindsay’s conclusion, but to demolish other conclusions upon the relationship of 
physique to income which were never deduced by us. 


The very guarded conclusion in the Report was: “These show very markedly the relation- 
ship between the physique and the food. When the weight is much below the average for that 
age almost without exception the diet is inadequate.” 


Weights alone were considered. Thirty-six children, boys and girls, were dealt with. As 
the relationship of weight to income was not under consideration, they were classified not 
according to the income but according to the energy value of the family diet. Hence 
Professor Pearson’s remarks upon this point are quite beside the mark. 


I give below, in a re-arranged form, the Table from Appendix IV. The individuals are 
placed in two groups according to the energy value of their diets, with, opposite each child, 
the average weight for the age, taken from the Report of the Anthropometric Committee 
published in the Transactions of the British Association for the Advancement of Science, 
1883, and with the difference between the weight of the child and the average weight. The 
differences between groups 1 and 2 are sufficiently marked and warrant the conclusion as 
stated above. 


That is, of the children in families the diets of which yielded more than 3000 Calories per 
man per day: 
10 were above the standard or not more than 5 lbs. below it, 
8 were more than 5 lbs. below it, 


while of the children in families in which the diet yielded less than 3000 Calories 
3 were above the standard or not more than 5 lbs. below it, 
15 were more than 5 lbs. below it. 
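
The four counts just given may be put side by side as proportions. The sketch below is a plain tabulation, not a test of significance. The names `groups` and `share_near_standard` are hypothetical, and the counts are those stated in the text.

```python
# Counts from the text: (at or within 5 lbs. of standard, more than 5 lbs. below).
groups = {
    "above 3000 Calories": (10, 8),
    "below 3000 Calories": (3, 15),
}


def share_near_standard(near, below):
    """Proportion of children above the standard or not more than 5 lbs. below it."""
    return near / (near + below)


shares = {name: share_near_standard(*counts) for name, counts in groups.items()}
for name, share in shares.items():
    print(f"{name}: {share:.0%} at or near the standard")
```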


It must be remembered that the ‘standard’ is for the children of all classes and not for 
those of the poorer classes. 


The fact that the average age of the children in the second group was about 1½ years
greater than that of the children in the first group does not account for the marked
difference.


The last question which Miss Lindsay had to consider was, how the necessary supply of 
energy and of protein might be supplied without increased expenditure, and she was right in 
stating that these can be more cheaply purchased in vegetable than in animal foods. She 




TABLE A. 


Family Diets above 3000 Calories per Man per Day. 


Number | Calories in Diet | Age in Years | Sex | Weight in lbs. | Standard Weight in lbs. | Difference

2 4003 7 39 = 74 
2 4003 10 63 455 
2 4003 8 a 50 549 | - 49 
2 4003 5 35 399 49 
36 4091 3°25 3 35 340 | - 10 
4 3882 8 45 52-2 - 72 
32 3822 6°25 Q 39 3-4 
— 3882 6 9 39 42-4 | = 3-4 
4 3882 10 3 56 67°5 -115 
39 3422 10°5 55 
50 3471 6°25 37 | 42-4 54 
50 3215 6 Q 47 424 | 4+ 46 
3116 6 43 | 49-4 | + 06 
18 3248 Bd Q 43 | 410 | + 20 
5h (3282 5 33 39°6 66 
| 458 | 3030 6 | 8 444 | 6-4 
80% 3136 ein 41 —~20°0 

| 49 33841 55 3 42 

| 


* Family with rickets. 


TABLE B. 


Family Diets below 3000 Calories per Man per Day. 


Number | Calories in Diet | Age | Sex | Weight in lbs. | Standard Weight | Difference

14 2690 13 | 3 76 87 | -11°0 
2690 12 764 | —16-4 
14 2690 10 | 45°5 
15 2936 10 9 | 56 62°0 — 60 
17 2931 9°75 44 62:0 | —18-0 
55 2686 575 | Q 42 42-4 | — 0-4 
2690 9 | | 4 60°4 —15°4 
41 2723 675 | | 49°7 + 33 
14 2690 6 36 44-4 — 84 
2974 39°9 - 29 

3 2891 5 | 39°9 29 
42 2772 55 | 9 | & 41-0 - 70 
2412 11 | | 68°1 —29°1 
21 2329 9 | 375 55°5 — 18-0 
24 2412 6 | Q 28 42°4 —14°4 
21 2329 11 60 72°0 —12-0 
10 2435 ee 43 54°9 -11°9 
59 1978 5 | 3 | 26 39°9 -13°9 




undoubtedly starts with the well-known conclusion that a Calorie in the food absorbed in a 
mixed diet from whatever source, protein, fat or carbohydrate is of equal dynamic value. 
Previous work amply justifies this. 


She was not foolish enough to attempt to draw any conclusion from her investigations as
to the relative value of animal and vegetable food in the diets on the physical development
of the individuals.


Professor Pearson seems entirely unable to grasp the fundamental fact that the physical 
development of the individual depends largely upon his past conditions of life. To co-relate 
it with the special constituents of the food which he habitually eats will require not only an 
enormous series of studies, but a full investigation of the character of the various food stuffs 
and of the mode of cooking. 


These points I tried to explain to him when I wrote to him in summer. He did not 
write to me as, in his criticism, he says he did. Miss Lindsay forwarded to me a letter 
from him to her, and I wrote a reply to Professor Pearson which he did not acknowledge. 


In conclusion I would say that before he expects his criticism of a physiological problem 
to be taken seriously, he had better make some attempt to understand the nature of the 
problem. Certainly it is not my intention to waste time in replying further to his criticism 
unless in the future it is more pertinent than is his present contribution. 


II. The Statistical Study of Dietaries. A Rejoinder. 
By KARL PEARSON, F.R.S. 


I PUBLISH Professor Noel Paton’s reply because it is very typical of the type of difficulty
which we meet with at present, when we assert that what is really statistical work must be 
undertaken only by the adequately trained statistician and that when it is not, then the 
investigation cannot be considered as falling into the field of science. 


Professor Paton states that the following question given on p. 4 of the Report formulated its 
object: “Do the working classes of this city get such a diet as will enable them to develop into 
strong, healthy, energetic men, and as men, will enable them to do a strenuous day’s work; or 
are the conditions of the labouring classes such that a suitable diet is not obtainable?”... 

Now Professor Paton either assumes that the sample taken of the diet of the individual 
family was their customary diet, or he does not. If he does, then the question: Was the diet 
such as would enable the working classes “to develop into strong, healthy, energetic men”? 
has meaning. If he does not, not only is it idle, but the section dealing with the physique of 
the children on the basis of a sample diet taken as a rule for a week (occasionally for a fortnight), 
is beside the point. 

But anyhow, I ask how he can possibly ascertain how the working classes will “develop into 
strong, healthy, energetic men,” if he does not take an adequate anthropometric survey of the 
families subjected to the dietaries recorded? He says that it is accepted and proved that “If
a family diet...gives a yield of energy of less than 3500 calories per man per day it is insufficient
for active work; and if less than 3000, it is quite inadequate for the proper maintenance of
growth and normal activity.” He further assumes with Miss Lindsay that calories from animal
and vegetable foods have equal “dynamic value.” I assert that neither of these conclusions,
which he accepts, are based on adequate research and they are in fact refuted by Miss Lindsay’s
own material. For, if it can be shown that animal and vegetable calories have different results 
on the physical development of the children, it is clear that the first statement as to how many 
calories are needful for the proper maintenance of growth has no significance until a statement 
is made with regard to the source of the calories. Professor Paton cites no evidence for his 
statements; from what I have read on the subject of calories, I feel convinced that most 
of the data on the matter would not stand for five minutes any adequate statistical analysis. 
The Report, Professor Paton tells us, shows “very markedly the relationship between the 
physique and the food.” Yet in a previous paragraph he says “that the physique is determined 
by the whole previous condition of life and by the influence of heredity, and that it is absurd to 
attempt to relate it solely to the diet.” 


Now the only way to ascertain whether there was a marked relationship between the food 
and the physique of the children was to correlate the two for a constant age and investigate 
whether the correlations were such, having regard to their probable errors, that they could be 
considered significant. I did this with the result that the total calories in the food and the 
girls’ weight for constant age was not definitely significant with regard to the probable error, 
while in the case of the boys the probable error was so large that it was impossible to say 
whether the relationship was really considerable or not. In fact no marked relationship could 
be deduced from Miss Lindsay’s data, they were too inadequate. If Professor Paton’s statement 
as to the influence of heredity is to be trusted, then even my correction for age was inadequate, 
and the data ought to be corrected also for physique of parent! If so, why was the parent not 
measured ? 
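
The correlation "for a constant age" spoken of above is the ordinary partial correlation of weight and calories with age held constant. A minimal sketch follows, assuming the standard first-order formula. The function name and the three numerical correlations are hypothetical and are not taken from Miss Lindsay's data.

```python
import math


def partial_correlation(r_wc, r_wa, r_ca):
    """Correlation of weight (w) and calories (c) for constant age (a).

    Standard first-order partial correlation:
    (r_wc - r_wa * r_ca) / sqrt((1 - r_wa^2) * (1 - r_ca^2)).
    """
    return (r_wc - r_wa * r_ca) / math.sqrt((1 - r_wa ** 2) * (1 - r_ca ** 2))


# Hypothetical total correlations, for illustration only.
example = partial_correlation(r_wc=0.5, r_wa=0.5, r_ca=0.5)
print(round(example, 4))
```

When weight and calories both rise with age, a part of their total correlation is due to age alone, and the partial coefficient is correspondingly smaller than the total one.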


Professor Paton places before the readers of Biometrika two tables on which this “marked” 
relationship is asserted by him to rest. One of the cases in his Table A, No. 32, is erroneously 
placed in this table; the details show that the number of calories was 2949 and not 3822* ; 
it should be in Table B. These tables contain 16 boys’ weights and 20 girls’ weights. Professor 
Paton takes the British Association measurements, which are, of course, wholly inadequate as a 
test of Glasgow children, and making no real correction for age† considers whether the children
in the two tables were or were not above the quite arbitrary limit of 5 lbs. below standard. He
gives us no measure at all of the significance of the result, which is based on the vagaries of
sampling 16 boys of ages from 3 to 11, and 20 girls from 5 to 13; and he supposes in some way
that this treatment can possibly refute the correlation coefficient, ${}_a r_{wC_t}$, of weight and food
calories for constant age with its probable error! I can, however, throw more light on the
matter. Owing to the great courtesy of Dr Chalmers, Medical Officer of Health for Glasgow,
I have been able to more than treble the number of weights of the boys and girls subjected
to the dietaries. The results for total calories in food, $C_t$, now are‡:

Girls, 69: ${}_a r_{wC_t} = +0.21 \pm 0.08$,     Boys, 55: ${}_a r_{wC_t} = \ldots \pm 0.09$.
Thus the relation for boys is now quite insignificant, and for girls may well be insignificant 
also. At any rate although both correlations are positive, there is no “marked” relationship 
between the physique and the dietary. Of course, it may be said that these weights (w) have been 
taken at some interval after the dietaries were recorded, but unless we assume the dietary to be 
a rough measure of the permanent feeding of the family, whose physique has been gradually 
built up for years before the dietaries were recorded, the observations must be discarded as of no 
value at all for testing physique, or as Professor Paton phrases it “development.” 


* In the Appendix V of Rickety Families, it is given again; this time as 2329 calories.

† The deviation at each age would have to be measured in terms of the standard-deviation of weight
at that age; naturally the deviations are larger for older children.

‡ I have to thank Miss B. M. Cave for the present series of correlations.


But the most interesting point ascertained from the new material is the confirmation of the 
result that the higher the proportion of animal to vegetable calories the greater the weight. In 
Biometrika, Vol. IX, p. 533, we had for 16 boys and 20 girls:

Boys: ${}_a r_{w,\,C_v/C_a} = -0.23 \pm 0.16$,

Girls: ${}_a r_{w,\,C_v/C_a} = -0.12 \pm 0.15$.

We now have for 55 boys and 69 girls:

Boys: ${}_a r_{w,\,C_v/C_a} = -0.30 \pm 0.08$,

Girls: ${}_a r_{w,\,C_v/C_a} = -0.24 \pm 0.08$.
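
The probable errors attached to these coefficients agree with the usual large-sample expression 0.6745 (1 − r²)/√n. The sketch below recomputes three of them as a check. That this exact expression was the one used for the partial coefficients is an assumption, and the function name is hypothetical.

```python
import math


def probable_error(r, n):
    """Probable error of a correlation r from n observations.

    Assumed formula: 0.6745 * (1 - r^2) / sqrt(n).
    """
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


quoted = [
    ("boys, earlier sample", -0.23, 16),
    ("boys, enlarged sample", -0.30, 55),
    ("girls, enlarged sample", -0.24, 69),
]
for label, r, n in quoted:
    pe = probable_error(r, n)
    print(f"{label}: r = {r:+.2f}, probable error = {pe:.2f}, "
          f"|r| / p.e. = {abs(r) / pe:.1f}")
```

The ratio of a coefficient to its probable error is the measure of significance appealed to throughout the discussion.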


These results seem to indicate that Miss Lindsay and Professor Paton, who supports her view, 
are in error when they consider a calory the same whether it be from animal or vegetable food. 
On the other hand, our larger numbers now indicate that : 


(i) For a constant age the expenditure on vegetable or on animal food has no sensible relation 
to weight. 


(ii) For a constant age the number of calories in vegetable food has no sensible relation to 
weight. 


(iii) For a constant age the number of calories in animal food has a positive correlation with
weight for both girls and boys, being definitely significant in the first case (+0.32 ± 0.07) and not
so in the second (… ± 0.09).


(iv) For a constant age the correlations of weight with ratio of expenditure on vegetable and 
animal foods are for both boys and girls quite insignificant as compared with their probable 
errors. 


I am extremely obliged to Dr Chalmers for doing his best to supply additional material. As 
far as it goes, it tends to show that calories are of far more importance than expenditures, but 
that calories from animal food are more closely related to physique than are calories from 
vegetable food*. The new material supports my criticisms that the failure to distinguish 
between animal and vegetable calories stultified the advice given by Miss Lindsay, i.e. to spend 
money on oatmeal rather than on eggs. It also indicates that no safe conclusions with regard 
to dietaries can be drawn until a reasonable anthropometric survey accompanies the record 
of dietaries, and the whole is reduced with adequate statistical knowledge. 


One point I can allow Professor Paton. It was an oversight on my part, when I said that 
I had written to both Miss Lindsay and to himself; the letters in which Miss Lindsay and he 
stated that to follow up the families now would be impossible were both replies to one and the 
same letter of mine addressed to Miss Lindsay. The additional facts I desired were in their 
opinion unascertainable, and further correspondence did not seem to me likely to be of any 
service in achieving the end I had in view, namely to render of real service to science a piece of 
recording work from which in my opinion then and in my opinion still, very misleading conclu- 
sions had been drawn, and which conclusions in their turn had been exaggerated in the press 
résumés of the paper. I do not think any such work as that done on dietaries by Miss Lindsay 
and Professor Noel Paton will be of real value until (i) these dietaries are accompanied by 
a thorough anthropometric survey of the whole families of the dieted and (ii) the equality of 
animal and vegetable food calories ceases to be considered as a dogmatic truth. 


* Of course the results show that on such data as are available, the food has relatively little relation
to the weight, there is no “marked” relationship.


III. Note on the essential Conditions that a Population breeding at
random should be in a Stable State.


By K. PEARSON, F.R.S. 


Let us deal with bi-parental inheritance in the first place. Let $x$ be a character in the father,
mean $\bar{x}$, standard deviation $\sigma_1$; let $y$ be the same character in the mother, $\bar{y}$ its mean, and $\sigma_2$ its
standard deviation. Let $z$ be the character in offspring of one sex, $\sigma_3$ be the standard deviation of
all offspring of this sex and $\bar{z}$ the mean. Let $\mu_2', \mu_3', \ldots$; $\mu_2'', \mu_3'', \ldots$; $\mu_2''', \mu_3''', \ldots$ be the
moment coefficients about the means respectively of father, mother and offspring frequency
distributions. Let $\bar{z}_{xy}$ be the mean of the offspring of those parents, who have characters $x$ and $y$, and
let the array of frequency of such offspring be given by $f_3(u)\,du$ about $\bar{z}_{xy}$, i.e. the character of any
offspring in this array is $\bar{z}_{xy}+u$, where $u$ is independent of the parental characters $x$ and $y$, but
$\bar{z}_{xy}$ is a function of $x$ and $y$ the parental characters. Some writers have suggested that the
offspring character should be taken as a blend of the parental characters, i.e.

$$z = \tfrac{1}{2}(x+y),$$

understanding by blend the mean of the parental characters. This appears to be very
unsatisfactory for:


(a) It supposes the parental characters to fix absolutely the offspring characters which is far 
from a result of experience. 


(b) It supposes the mother to reproduce the female size of character in the male and the
female offspring alike, whereas she contributes to each the sex character of her own stock, i.e. if
she is a tall woman, she would contribute absolutely more to a son than to a daughter. The late
Sir Francis Galton got over this difficulty by “reducing female measures to their male equivalents.”
This he did by altering absolute measurements in the ratio of male to female mean
measurements. Thus he would take for the mean of his array of offspring

$$\tfrac{1}{2}\left(x + \frac{\bar{x}}{\bar{y}}\,y\right)$$

if he were dealing with male offspring. A more reasonable hypothesis is to assume that

$$\bar{z}_{xy} = \tfrac{1}{2}\,\sigma_3\left(\frac{x}{\sigma_1} + \frac{y}{\sigma_2}\right). \qquad \text{(i)}$$

This will practically agree with Sir Francis’s form, if the coefficients of variation in the two sexes
are the same, i.e. $\sigma_1/\bar{x} = \sigma_2/\bar{y}$.


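If we measure $u$ from the mean of the array of offspring we have

$$z = \tfrac{1}{2}\,\sigma_3\left(\frac{x}{\sigma_1} + \frac{y}{\sigma_2}\right) + u.$$

We shall now suppose the offspring to follow the law (i), or

$$z - \bar{z} = \tfrac{1}{2}\,\sigma_3\left(\frac{x-\bar{x}}{\sigma_1} + \frac{y-\bar{y}}{\sigma_2}\right) + u,$$

where $x$ and $y$ are uncorrelated (mating at random), and $u$ represents other influences than the
parental, and is therefore uncorrelated with $x$ and $y$*. The frequency distributions of $x$ and $y$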


* This assumes the homoscedasticity of the arrays of offspring due to pairs of fathers and mothers 
with characters x and y. 




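may be taken as given by $f_1(x-\bar{x})$ and $f_2(y-\bar{y})$. Let $N_1 \times N_2$ be the total number of possible
matings

$$= \iint f_1(x-\bar{x})\,f_2(y-\bar{y})\,dx\,dy,$$

and the total number of offspring $N_3$ in any array

$$= \int f_3(u)\,du.$$

I now propose to give the expression for the $n$th moment coefficient about the mean, i.e. $\mu_n'''$,
of the population of offspring of a given sex. We have

$$N_1 N_2 N_3\,\mu_n''' = \iiint \left\{u + \tfrac{1}{2}\sigma_3\left(\frac{x-\bar{x}}{\sigma_1}+\frac{y-\bar{y}}{\sigma_2}\right)\right\}^{n} f_1(x-\bar{x})\,f_2(y-\bar{y})\,f_3(u)\,dx\,dy\,du,$$

the integration being extended over the whole of the frequency distributions of father, mother
and offspring. Thus

$$N_1 N_2 N_3\,\mu_n''' = \sum_{s=0}^{n}\;\sum_{t=0}^{n-s} \frac{n!}{s!\,t!\,(n-s-t)!} \iiint \left\{\frac{\sigma_3(x-\bar{x})}{2\sigma_1}\right\}^{t} \left\{\frac{\sigma_3(y-\bar{y})}{2\sigma_2}\right\}^{n-s-t} u^{s}\, f_1(x-\bar{x})\,f_2(y-\bar{y})\,f_3(u)\,dx\,dy\,du.$$

Now $x$, $y$ and $u$ being independent we have

$$\frac{1}{N_1}\int (x-\bar{x})^{t} f_1(x-\bar{x})\,dx = \mu_t', \qquad \frac{1}{N_2}\int (y-\bar{y})^{t} f_2(y-\bar{y})\,dy = \mu_t'', \qquad \frac{1}{N_3}\int u^{s} f_3(u)\,du = \mu_s^{iv},$$

so that

$$\mu_n''' = \sum_{s=0}^{n}\;\sum_{t=0}^{n-s} \frac{n!}{2^{\,n-s}\,s!\,t!\,(n-s-t)!}\; \frac{\sigma_3^{\,n-s}}{\sigma_1^{\,t}\,\sigma_2^{\,n-s-t}}\; \mu_t'\,\mu_{n-s-t}''\,\mu_s^{iv}.$$

Thus we reach, remembering that $\mu_1' = \mu_1'' = \mu_1^{iv} = 0$,

$$\mu_2''' = \tfrac{1}{4}\sigma_3^2\left(\frac{\mu_2'}{\sigma_1^2} + \frac{\mu_2''}{\sigma_2^2}\right) + \mu_2^{iv}, \qquad \text{(v)}$$

$$\mu_3''' = \tfrac{1}{8}\sigma_3^3\left(\frac{\mu_3'}{\sigma_1^3} + \frac{\mu_3''}{\sigma_2^3}\right) + \mu_3^{iv}, \qquad \text{(vi)}$$

$$\mu_4''' = \tfrac{1}{16}\sigma_3^4\left(\frac{\mu_4'}{\sigma_1^4} + \frac{6\mu_2'\mu_2''}{\sigma_1^2\sigma_2^2} + \frac{\mu_4''}{\sigma_2^4}\right) + \tfrac{3}{2}\sigma_3^2\left(\frac{\mu_2'}{\sigma_1^2} + \frac{\mu_2''}{\sigma_2^2}\right)\mu_2^{iv} + \mu_4^{iv}. \qquad \text{(vii)}$$

But $\mu_2' = \sigma_1^2$, $\mu_2'' = \sigma_2^2$, and $\mu_2''' = \sigma_3^2$. Hence we must have

$$\mu_2^{iv} = \tfrac{1}{2}\sigma_3^2. \qquad \text{(viii)}$$

If as usual we take $\beta_1 = \mu_3^2/\mu_2^3$ and $\beta_2 = \mu_4/\mu_2^2$ we find from (vi) and (vii), writing $s^2 = \mu_2^{iv}$,

$$\sqrt{\beta_1'''}\,\sigma_3^3 = \tfrac{1}{8}\sigma_3^3\left(\sqrt{\beta_1'} + \sqrt{\beta_1''}\right) + \sqrt{\beta_1^{iv}}\,s^3,$$

$$\beta_2'''\,\sigma_3^4 = \tfrac{1}{16}\sigma_3^4\left(\beta_2' + \beta_2'' + 6\right) + 3\sigma_3^2 s^2 + \beta_2^{iv}\,s^4.$$

Whence by the use of (viii)

$$\sqrt{\beta_1^{iv}} = 2\sqrt{2}\left\{\sqrt{\beta_1'''} - \tfrac{1}{8}\left(\sqrt{\beta_1'} + \sqrt{\beta_1''}\right)\right\}, \qquad \beta_2^{iv} = 4\beta_2''' - \tfrac{1}{4}\left(\beta_2' + \beta_2''\right) - \tfrac{15}{2}.$$
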
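Hence in order that the offspring population should be stable, it is needful that in the array of
offspring for given parents

(a) $s = \sigma_3/\sqrt{2}$,

(b) $\sqrt{\beta_1^{iv}} = 2\sqrt{2}\,\sqrt{\beta_1'''}\left(1 - \tfrac{1}{4}\right) = \dfrac{3}{\sqrt{2}}\sqrt{\beta_1'''}$,
if $\beta_1''' = \beta_1' = \beta_1''$, i.e. the skewness be the same for fathers, mothers and offspring,

(c) $\beta_2^{iv} = \tfrac{1}{2}\left(7\beta_2''' - 15\right)$,
if $\beta_2''' = \beta_2' = \beta_2''$.

Thus, we have for the array of offspring of given parents

$$s = \frac{\sigma_3}{\sqrt{2}}, \qquad \beta_1^{iv} = \tfrac{9}{2}\,\beta_1''', \qquad \beta_2^{iv} - 3 = \tfrac{7}{2}\left(\beta_2''' - 3\right). \qquad \text{(xiii)}$$

Accordingly the variability of the array is less than that of the population of offspring; and
the array (unless $\beta_1''' = 0$, $\beta_2''' = 3$) is more skew and has greater kurtosis than the general
population.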
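
The three conditions may be put to a small numerical test. The sketch below assumes the model used in this note (offspring deviation equal to half the sum of the standardised parental deviations, scaled by the offspring standard deviation, plus an independent residual) and assumes that fathers, mothers and offspring have the same β₁ and β₂. The function name `array_constants` is hypothetical.

```python
import math


def array_constants(sigma3, beta1_population, beta2_population):
    """Constants of the array of offspring implied by stability.

    Assumes equal beta_1 and beta_2 in fathers, mothers and offspring, and
    returns (s, beta_1 of array, beta_2 of array), where
    s = sigma3 / sqrt(2), beta_1 = 9/2 beta_1''', beta_2 = (7 beta_2''' - 15) / 2.
    """
    s = sigma3 / math.sqrt(2.0)
    beta1_array = 4.5 * beta1_population
    beta2_array = 0.5 * (7.0 * beta2_population - 15.0)
    return s, beta1_array, beta2_array


# A Gaussian population (beta_1 = 0, beta_2 = 3) gives a Gaussian array.
print(array_constants(1.0, 0.0, 3.0))
# A leptokurtic population gives a still more leptokurtic array.
print(array_constants(1.0, 0.1, 4.0))
```

Only the Gaussian values are reproduced unchanged in the array; any departure from them in the population is magnified.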


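If $r_{12}$, $r_{23}$, $r_{13}$ be the three correlations of father, mother and offspring we know that the mean
standard-deviation of the offspring of arrays having the same parents is

$$\sigma_3\sqrt{\frac{1 - r_{12}^2 - r_{13}^2 - r_{23}^2 + 2r_{12}r_{13}r_{23}}{1 - r_{12}^2}},$$

and this equals, if there be no assortative mating ($r_{12}=0$),

$$\sigma_3\sqrt{1 - r_{13}^2 - r_{23}^2}.$$

If we could assume this equal to $s$ we must have, since $s = \sigma_3/\sqrt{2}$,

$$1 - r_{13}^2 - r_{23}^2 = \tfrac{1}{2},$$

leading to $r_{13}^2 + r_{23}^2 = \tfrac{1}{2}$, or if the two parental correlations are equal to

$$r_{13} = r_{23} = \tfrac{1}{2}.$$

In other words, if the parental influences were equal and there were no assortative mating
and the character in the array of offspring had the mean value

$$\tfrac{1}{2}\,\sigma_3\left(\frac{x}{\sigma_1} + \frac{y}{\sigma_2}\right),$$

then the population could only be stable if

$$r_{13} = r_{23} = 0.5.$$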
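
The standard deviation of an array of offspring of given parents follows from the three correlations by the usual multiple-correlation formula, and the value 0.5 can be checked directly. The sketch below assumes that formula; the function name `array_sd` is hypothetical.

```python
import math


def array_sd(sigma3, r12, r13, r23):
    """Standard deviation of offspring about the mean of their array.

    sigma3 * sqrt((1 - r12^2 - r13^2 - r23^2 + 2 r12 r13 r23) / (1 - r12^2)),
    with 1 = father, 2 = mother, 3 = offspring.
    """
    numerator = 1 - r12 ** 2 - r13 ** 2 - r23 ** 2 + 2 * r12 * r13 * r23
    return sigma3 * math.sqrt(numerator / (1 - r12 ** 2))


# No assortative mating and parental correlations of 0.5 each:
print(array_sd(1.0, 0.0, 0.5, 0.5))   # equals sigma3 / sqrt(2)
print(1.0 / math.sqrt(2.0))
```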


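But this apparently noteworthy result only begs the question. By the general theory of
correlation the mean of the array of offspring is

$$\bar{z} + \sigma_3\left(r_{13}\frac{x-\bar{x}}{\sigma_1} + r_{23}\frac{y-\bar{y}}{\sigma_2}\right) = \sigma_3\left(r_{13}\frac{x}{\sigma_1} + r_{23}\frac{y}{\sigma_2}\right) + \left\{\bar{z} - \sigma_3\left(r_{13}\frac{\bar{x}}{\sigma_1} + r_{23}\frac{\bar{y}}{\sigma_2}\right)\right\},$$

if there be no assortative mating.
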


Hence if we assume the mean of array of offspring to be given by

$$\tfrac{1}{2}\,\sigma_3\left(\frac{x}{\sigma_1} + \frac{y}{\sigma_2}\right),$$

(i) the second portion of the expression must be zero, i.e. mean of whole population of
offspring must coincide with mean of array of offspring where parents have the mean values and

(ii) we must have $r_{13} = r_{23} = \tfrac{1}{2}$. In other words the form of our assumption involves both the
equal influence of the parents and the value of the parental correlation.


From the standpoint of heredity no such assumption is legitimate. Neither in Mendelian
theory nor in biometric formula, nor again in actual observation is it permissible to suppose that
the mean of the array of offspring is determined solely by the parents. Still less is it possible
to suppose the actual character of the offspring to be the mean of that of the parents (i.e. put
$u=0$). If it were we should have $z=\tfrac{1}{2}(x+y)$, whence flow

$$\mu_2''' = \tfrac{1}{4}\left(\mu_2' + \mu_2''\right), \qquad \mu_3''' = \tfrac{1}{8}\left(\mu_3' + \mu_3''\right), \qquad \mu_4''' = \tfrac{1}{16}\left(\mu_4' + 6\mu_2'\mu_2'' + \mu_4''\right). \qquad \text{(xiv)}$$

But these equations assume that $\mu_2^{iv}$, $\mu_3^{iv}$ and $\mu_4^{iv}$ are all zero, an absurdity in itself and
contrary to all experience, whether biometric or Mendelian. For non-assortative mating and
equal potency of parents, they lead to parental correlations of the order 0.7 and to an impossibility
of stability in any population*.


In fact any such relations as (xiv) are inconceivable on the basis of both biometric as well as
Mendelian theory and observation. Parental correlations have never been observed anywhere
near such a value as 0.7. Equations (xiii) are, however, suggestive; they show that if the
parental distribution be symmetrical and mesokurtic, the array of offspring will remain so after
selection; but if the parental distribution does not possess these characters, then any selection of
individual parents will emphasize the asymmetry and the kurtosis in the resulting array of
offspring; or continued selection of this type will lead to greater and greater divergence from the
normal or Gaussian frequency distribution.


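* If we assume that the mean of the array of offspring of parents of characters $x$ and $y$ is given by
$lx + my$, it is only another way of asserting that the regression is linear and that

$$l = \frac{\sigma_3}{\sigma_1}\,\frac{r_{13} - r_{12}r_{23}}{1 - r_{12}^2}, \qquad m = \frac{\sigma_3}{\sigma_2}\,\frac{r_{23} - r_{12}r_{13}}{1 - r_{12}^2}.$$

If we make $l=m$, or give equal weight to the parents, it is only rational to suppose that $\sigma_1=\sigma_2$ and
$r_{13}=r_{23}$, which lead us to

$$l = m = \frac{\sigma_3}{\sigma_1}\,\frac{r_{13}}{1 + r_{12}}.$$

Hence the mean of the array is

$$\frac{\sigma_3}{\sigma_1}\,\frac{r_{13}}{1 + r_{12}}\,(x+y),$$

and whether we make $x$ constant and $y$ constant or $x+y$ constant leads to precisely the same variability
in the array, i.e.

$$\sigma_3\sqrt{1 - \frac{2r_{13}^2}{1 + r_{12}}}.$$

If assortative mating be zero, this equals $\sigma_3\sqrt{1 - 2r_{13}^2}$ and, if to reach the results for $\mu_n'''$ given
above we put this zero, we must have

$$r_{13} = \sqrt{0.50} = 0.7 \text{ nearly}.$$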




IV. The Elimination of Spurious Correlation due to position in Time 
or Space. 


By “STUDENT.” 


In the Journal of the Royal Statistical Society for 1905*, p. 696, appeared a paper by 
R. H. Hooker giving a method of determining the correlation of variations from the
“instantaneous mean” by correlating corresponding differences between successive values. This
method was invented to deal with the many statistics which give the successive annual values 
of vital or commercial variables; these values are generally subject to large secular variations, 
sometimes periodic, sometimes uniform, sometimes accelerated, which would lead to altogether 
misleading values were the correlation to be taken between the figures as they stand. 

Since Mr Hooker published his paper, the method has been in constant use among those who 
have to deal statistically with economic or social problems, and helps to show whether, for 
example, there really is a close connection between the female cancer death rate and the quantity
of imported apples consumed per head ! 

Prof. Pearson, however, has pointed out to me that the method is only valid when the 
connection between the variables and time is linear, and the following note is an effort to extend 
Mr Hooker’s method so as to make it applicable in a rather more general way. 

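If $x_1, x_2, x_3$, etc., $y_1, y_2, y_3$, etc., be corresponding values of the variables $x$ and $y$, then if
$x_1, x_2, x_3$, etc., $y_1, y_2, y_3$, etc. are randomly distributed in time and space, it is easy to show that
the correlation between the corresponding $n$th differences is the same as that between $x$ and $y$.

Let ${}_nD_x$ be the $n$th difference.

For ${}_1D_x = x_1 - x_2$, so that ${}_1D_x^{\,2} = x_1^2 - 2x_1x_2 + x_2^2$.

Summing for all values and dividing by $N$ and remembering that since $x_1$ and $x_2$ are mutually
random $S(x_1x_2)=0$, we get†

$$\sigma^2_{{}_1D_x} = 2\sigma_x^2, \qquad \text{and similarly} \qquad \sigma^2_{{}_1D_y} = 2\sigma_y^2.$$

Again

$${}_1D_x\,{}_1D_y = (x_1-x_2)(y_1-y_2) = x_1y_1 - x_1y_2 - x_2y_1 + x_2y_2.$$

Summing for all values and dividing by $N$, and remembering that $x_1$ and $y_2$ and $x_2$ and $y_1$ are
mutually random,

$$r_{{}_1D_x\,{}_1D_y}\,\sigma_{{}_1D_x}\,\sigma_{{}_1D_y} = 2\,r_{xy}\,\sigma_x\sigma_y, \qquad \text{so that} \qquad r_{{}_1D_x\,{}_1D_y} = r_{xy}.$$

Proceeding successively

$$r_{{}_nD_x\,{}_nD_y} = r_{{}_{n-1}D_x\,{}_{n-1}D_y} = \ldots = r_{xy}. \qquad (1)$$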


Now suppose $x_1, x_2, x_3$, etc. are not random in space or time; the problems arising from
correlation due to successive positions in space are exactly similar to those due to successive
occurrence in time, but as they are to some extent complicated by the second dimension, it is
perhaps simpler to consider correlation due to time. Suppose

$$x_1 = X_1 + bt_1 + ct_1^2 + dt_1^3 + \text{etc.}, \qquad x_2 = X_2 + bt_2 + ct_2^2 + dt_2^3 + \text{etc.},$$

where $X_1$, $X_2$, etc. are independent of time and $t_1, t_2, t_3$ are successive values of time, so that
$t_n - t_{n-1} = T$, and suppose $y_1 = Y_1 + b't_1 + c't_1^2 + \text{etc.}$ as before.


* The method had been used by Miss Cave in Proc. Roy. Soc., Vol. LXXIV. pp. 407 et seq. that is in
1904, but being used incidentally in the course of a paper it attracted less attention than Hooker’s
paper which was devoted to describing the method. The papers were no doubt quite independent.

† The assumption made is that $n$ is sufficiently large to justify the relations
$S_1^{\,n-1}(x)/(n-1) = S_2^{\,n}(x)/(n-1) = S_1^{\,n}(x)/n$ and $S_1^{\,n-1}(x^2)/(n-1) = S_2^{\,n}(x^2)/(n-1) = S_1^{\,n}(x^2)/n$
being taken to hold.


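Then

$${}_1D_x = {}_1D_X - bT - cT(t_1+t_2) - dT(t_1^2 + t_1t_2 + t_2^2) - \text{etc.}$$

$$= {}_1D_X - \{bT + cT^2 + dT^3 + \text{etc.}\} - t_1\{2cT + 3dT^2 + 4eT^3 + \text{etc.}\} - t_1^2\{3dT + 6eT^2 + \text{etc.}\} - \text{etc.}$$

In this series the coefficients of $t_1$, $t_1^2$, etc. are all constants and the highest power of $t_1$ is one
lower than before, so that by repeating the process again and again we can eliminate $t$ from the
variable on the right-hand side, provided of course that the series ends at some power of $t$.
When this has been done, we get

$${}_nD_x = {}_nD_X + \text{a constant}, \qquad {}_nD_y = {}_nD_Y + \text{a constant},$$

and

$$r_{{}_nD_x\,{}_nD_y} = r_{{}_nD_X\,{}_nD_Y} = r_{XY},$$

and of course $r_{{}_{n+1}D_x\,{}_{n+1}D_y} = r_{{}_nD_x\,{}_nD_y}$, for ${}_nD_X$ and ${}_nD_Y$ are now random variables independent
of time.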


Hence if we wish to eliminate variability due to position in time or space and to determine
whether there is any correlation between the residual variations, all that has to be done is to
correlate the 1st, 2nd, 3rd...nth differences between successive values of our variable with the
1st, 2nd, 3rd...nth differences between successive values of the other variable. When the
correlation between the two nth differences is equal to that between the two (n+1)th differences,
this value gives the correlation required.
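
The procedure just described can be tried on artificial figures. In the sketch below two series share correlated random parts but carry different polynomial trends in time, and the correlation of their successive differences is printed for each order. All names, the seed, and the coefficients of the trends are hypothetical and chosen only for illustration. Differences are taken as later value minus earlier, which changes the sign of every difference alike and leaves the correlation unaltered.

```python
import math
import random


def differences(series, order):
    """Return the differences of the given order between successive values."""
    for _ in range(order):
        series = [later - earlier for earlier, later in zip(series, series[1:])]
    return series


def correlation(xs, ys):
    """Product-moment correlation of two equally long series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(1914)          # fixed seed so that the run is repeatable
times = list(range(60))
# Random parts X, Y, independent of time but correlated with one another.
X = [rng.gauss(0.0, 1.0) for _ in times]
Y = [0.6 * value + 0.8 * rng.gauss(0.0, 1.0) for value in X]
# Observed series: random part plus a trend of the second degree in time.
x = [X[t] + 0.05 * t ** 2 for t in times]
y = [Y[t] + 2.0 * t - 0.04 * t ** 2 for t in times]

for order in range(5):
    r = correlation(differences(x, order), differences(y, order))
    print(f"order {order}: r = {r:+.3f}")
```

With trends of the second degree the second differences contain only a constant besides the random parts, so that from the second order onward the correlation of the differences of x and y is identical with that of the differences of X and Y.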


This process is tedious in the extreme, but that it may sometimes be necessary is illustrated 
by the following examples: the figures from which the first two are taken were very kindly 
supplied to me by Mr E. G. Peake, who had been using them in preparing his paper “The 
Application of the Statistical Method to the Bankers’ Problem” in The Bankers’ Magazine (July— 
August, 1912). The material for the next is taken from a paper in The Journal of Agricultural
Science by Hall and Mercer, on the error of field trials, and consists of the yields of wheat and straw on
500 plots of 1/500 acre each, into which an acre of wheat was divided at harvest. The remainder are from
the three Registrar-Generals’ returns. 


Correlation between:
I.    Sauerbeck’s Index numbers and Bankers’ Clearing House rates per [...].
II.   Marriage Rate and Wages.
III.  Yield of Grain and Yield of Straw.
IV, V, VI.  Tuberculosis Death Rate and Infantile Mortality in Ireland, England and Scotland respectively.

                      I            II             III          IV          V           VI
Raw figures           -.33         - [illegible]  +.753        +.63 and +.35 survive for IV-VI; one value illegible
First difference      +.51         [illegible]    +.590        +.75        +.69        +.51
Second difference     +.30         +.58           +.539        +.74        +.74        +.65
Third difference      +.07         +.52
Fourth difference     +.11         + [illegible]
Fifth difference      [illegible]  +.58
Sixth difference      [illegible]  [illegible]
Number of cases       41 years     57 years       500 plots    42 years (IV-VI)


The difference between I and II is very marked, and would seem to indicate that the causal 
connection between index numbers and Bankers’ clearing house rates is not altogether of the 
same kind as that between marriage rate and wages, though all four variables are commonly 
taken as indications of the short period trade wave. I had hoped to investigate this subject 
more thoroughly before publishing this note, but lack of time has made this impossible. 




V. On certain Errors with regard to Multiple Correlation occasionally
made by those who have not adequately studied this Subject.


By KARL PEARSON, F.R.S. 


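(1) It is well known* that if we endeavour to predict the value of a variate $x_0$ from $n$
correlated variates $x_1, x_2, \ldots, x_n$, by determining a linear function of $x_1, x_2, \ldots, x_n$ which has
the maximum correlation $R_n$ with $x_0$, then the value of $R_n^2$ is given by

$$R_n^2 = 1 - \Delta/\Delta_{00},$$

where $\Delta$ is the determinant

$$\Delta = \begin{vmatrix}
1 & r_{01} & r_{02} & \cdots & r_{0n} \\
r_{10} & 1 & r_{12} & \cdots & r_{1n} \\
\vdots & \vdots & \vdots & & \vdots \\
r_{n0} & r_{n1} & r_{n2} & \cdots & 1
\end{vmatrix}$$

and $\Delta_{pq}$ is the minor corresponding to the constituent of the pth column and qth row.

The system I propose to consider is that in which all correlations like $r_{0p}$ are equal, whatever
p be, to a constant $\rho$, and all correlations $r_{pq}$, where p and q may take any values from 1 to n,
are the same and equal to $\epsilon$. We now have for the value of $\Delta$ the expression

$$\Delta = \begin{vmatrix}
1 & \rho & \rho & \cdots & \rho \\
\rho & 1 & \epsilon & \cdots & \epsilon \\
\rho & \epsilon & 1 & \cdots & \epsilon \\
\vdots & \vdots & \vdots & & \vdots \\
\rho & \epsilon & \epsilon & \cdots & 1
\end{vmatrix}$$

To evaluate this determinant add all the rows but the first together, giving
$$n\rho,\quad 1+(n-1)\epsilon,\quad 1+(n-1)\epsilon,\quad \ldots,\quad 1+(n-1)\epsilon,$$
multiply the result by $\rho/(1+(n-1)\epsilon)$ and subtract from the first row. We have

$$\Delta = \begin{vmatrix}
1 - \dfrac{n\rho^2}{1+(n-1)\epsilon} & 0 & 0 & \cdots & 0 \\
\rho & 1 & \epsilon & \cdots & \epsilon \\
\rho & \epsilon & 1 & \cdots & \epsilon \\
\vdots & \vdots & \vdots & & \vdots \\
\rho & \epsilon & \epsilon & \cdots & 1
\end{vmatrix}
= \left(1 - \frac{n\rho^2}{1+(n-1)\epsilon}\right)\Delta_{00}.$$

Hence
$$R_n^2 = 1 - \left(1 - \frac{n\rho^2}{1+(n-1)\epsilon}\right) = \frac{n\rho^2}{1+(n-1)\epsilon} \qquad \text{(i)},$$
or
$$R_n = \frac{\rho}{\sqrt{\epsilon + (1-\epsilon)/n}}\ †.$$

Hence proceeding to the limit we have
$$R_\infty = \rho/\sqrt{\epsilon}.$$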


* Biometrika, Vol. vi. p. 439.
† The sign of $R_n$ must be determined from other considerations.




Thus if n variates are equally correlated ($\epsilon$) among themselves, and equally correlated ($\rho$) with
another variable, we shall not indefinitely increase the accuracy with which the last variable will
be predicted from the others by increasing indefinitely the number of the variates n.


Illustration. The coefficient of multiple correlation is required as we increase the number of
brothers from whom a prediction of a character in a given brother is made. The fraternal
correlation = .5.


Number of Brothers      R_n
        1              .5000
        2              .5774
        3              .6124
        4              .6325
        5              .6455
        6              .6547
       10              .6742
        ∞              .7071
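The table can be checked directly from the result of section (1). The following is a minimal sketch, assuming the formula $R_n^2 = n\rho^2/(1+(n-1)\epsilon)$ with limit $\rho/\sqrt{\epsilon}$; the function names `multiple_r` and `limiting_r` are introduced here for illustration only.

```python
from math import sqrt


def multiple_r(n, rho, eps):
    """R_n for n predictors, each correlated rho with the predicted
    variate and eps with every other predictor."""
    return sqrt(n * rho ** 2 / (1 + (n - 1) * eps))


def limiting_r(rho, eps):
    """Limit of R_n as the number of predictors grows without bound."""
    return rho / sqrt(eps)


# Fraternal correlation .5: each brother is correlated .5 with the
# subject and .5 with every other brother.
for brothers in (1, 2, 3, 4, 5, 6, 10):
    print(f"{brothers:>3} brothers: R = {multiple_r(brothers, 0.5, 0.5):.4f}")
print(f"  limit      : R = {limiting_r(0.5, 0.5):.4f}")
```

The same function with `eps=0.0` or `eps=0.15` and `n=2` gives the two-parent figures discussed next.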


Compare against these results two parents only in a population where there is no assortative
mating and the parental correlation = .5. Here $\epsilon = 0$ and $n = 2$, so $R = \tfrac{1}{2}\sqrt{2} = .7071$, or two
parents will give more information than 10 brothers and sisters, and as much in fact as an
indefinite number. Suppose the parents tend to select their like, i.e. suppose there is assortative
mating in the population, say $\epsilon = .15$; then with the same intensity of parental correlation


$R = .6594$,
or, two parents will give us more information than six brothers and sisters.


Now this illustration brings out the real nature of the effect of increasing the number of 
variables from which we predict. Such increase has very little value, if those variables are 
fairly highly correlated with each other. To be effective they must be highly correlated with 
the variate we wish to predict and correlated very slightly with each other. 


Even in this case there is a limit to the degree of correlation reached when the number of
variates is indefinitely increased, namely $\rho/\sqrt{\epsilon}$, and it is clear that if $\rho$ be small and $\epsilon$ fairly large,
no very great increase of correlation is obtained if we use an indefinitely great number of variates.
For example if $\rho = .05$ and $\epsilon = .5$, we find $R_\infty = .0707$ only. Even if $\rho$ were .10, we should only
raise $R$ to .1414, could we predict from an indefinitely large number of such correlated variates*.
Indeed as long as $\epsilon$ is not less than $\rho$ we gain singularly little by combining large numbers of
variates. For example if $\rho$ were .4 and $\epsilon = .4$, ten such variates would only raise the correlation to
.5898, and an indefinitely large number to .6325, which is less than double the single correlation.
Yet there are apparently many persons who believe that by taking a number of low correlations,
a high relationship can be reached!


Actually there is a limit to what relations can possibly exist between a variate $x_0$ and a series
of equally correlated variables $x_1, \ldots, x_n$. Since $R_n$ must be less than unity, we have

$$\rho < \sqrt{\epsilon + \frac{1-\epsilon}{n}},$$

or
$$\epsilon > \frac{n\rho^2 - 1}{n - 1}.$$

Thus if $n = 10$ and $\rho = .5$, $\epsilon$ must be > .1667. Or, it would be impossible for 10 variates to
have a correlation .5 with another variable, and a zero correlation with each other.
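The bound can be illustrated numerically. This is a small sketch under the assumption that the equal-correlation formula $R_n^2 = n\rho^2/(1+(n-1)\epsilon)$ applies; `minimum_eps` and `r_squared` are hypothetical helper names.

```python
def minimum_eps(n, rho):
    """Smallest inter-correlation eps for which R_n does not exceed unity,
    given n predictors each correlated rho with the predicted variate."""
    return (n * rho ** 2 - 1) / (n - 1)


def r_squared(n, rho, eps):
    """Squared multiple correlation for the equal-correlation system."""
    return n * rho ** 2 / (1 + (n - 1) * eps)


bound = minimum_eps(10, 0.5)
print(f"eps must exceed {bound:.4f}")

# With eps = 0 the formula would demand R^2 = 2.5, which no set of real
# variates can supply; at the bound itself R^2 is exactly unity.
print(r_squared(10, 0.5, 0.0))
print(r_squared(10, 0.5, bound))
```

A value of $R^2$ above unity signals that the supposed system of correlations cannot exist, which is the sense of the impossibility stated in the text.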


* Even if $\rho$ were .10 and $\epsilon$ as low as .10 we should not raise R for endless variates of this order of
correlation above .3163, while from compounding ten such variates we should only obtain a correlation
about double that of a single variate, i.e. R = .2294.


If we suppose a number of variates n to be uncorrelated with each other, but correlated
with another variable, then we have from the determinant as given below

$$\Delta = \begin{vmatrix}
1 & r_{01} & r_{02} & \cdots & r_{0n} \\
r_{01} & 1 & 0 & \cdots & 0 \\
r_{02} & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots \\
r_{0n} & 0 & 0 & \cdots & 1
\end{vmatrix}
= (1 - r_{01}^2 - r_{02}^2 - \ldots - r_{0n}^2)\,\Delta_{00}.$$

Hence
$$R_n^2 = r_{01}^2 + r_{02}^2 + \ldots + r_{0n}^2,$$
and, since $R_n$ cannot exceed unity,
$$\sqrt{\frac{r_{01}^2 + r_{02}^2 + \ldots + r_{0n}^2}{n}} < \frac{1}{\sqrt{n}}.$$

Therefore, if n variables, uncorrelated among themselves, be correlated with an additional
variable, it is necessary that the root mean square of their correlations should be less than
$1/\sqrt{n}$. We see therefore that it must either be impossible to find a large number of variables
uncorrelated among themselves, which are correlated with an additional variable, or else their
correlations with this variable must be extremely low. The last result shows us the fallacy
of supposing that correlations are simply added together for a combined effect; clearly when
the variates are uncorrelated among themselves, we add by the sum of the squares. For
example, if $r_{01} = r_{02} = \ldots = r_{0n} = .03$, one hundred such variables would only raise R to .30. On
the other hand if the variates are highly correlated together, say $\epsilon = .81$, an indefinitely great
number of such variables would only raise the multiple correlation to .0333, if the individual
correlation were .0300.
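The contrast between adding correlations and adding their squares is easily put in figures. A minimal sketch, assuming mutually uncorrelated predictors so that $R^2$ is the sum of the squared correlations; the name `r_uncorrelated` is an illustrative choice.

```python
from math import sqrt


def r_uncorrelated(correlations):
    """Multiple correlation when the predictors are mutually uncorrelated:
    the squares of the separate correlations are added, not the correlations."""
    return sqrt(sum(r ** 2 for r in correlations))


hundred = [0.03] * 100

# Naive addition of one hundred correlations of .03 (the fallacy).
print(round(sum(hundred), 4))

# Addition by the sum of the squares.
print(round(r_uncorrelated(hundred), 4))

# Limit for highly inter-correlated predictors, eps = .81.
print(round(0.03 / sqrt(0.81), 4))
```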


We are now in a position to apply our results to the problem of the relative intensity of 
heredity and environment. This problem has been singularly misunderstood especially by the 
popular exponents of Eugenics. Some illustrations of this may be given here. Major Leonard 
Darwin writes as follows in the Journal of the Eugenics Education Society: “It is impossible 
to compare heredity as a whole with environment as a whole as far as their effects are 
concerned ; for no living being can exist for a moment without either of them*. Moreover, 
in order to compare two things so as to be able to use the words more or less in connection 
with such a comparison, we must have a common unit of measurement applicable to them 
both. But what is the unit by which both heredity and environment may be measured ? 
I myself have no idea. May we not be discussing questions as illogical as enquiring what 
portion of the area of a rectangle is due to its width and what to its length? Is it ever wise
to use words in scientific literature without endeavouring to attach a definite meaning to them†?”


It is hard to conceive a paragraph of the same length more full of evidence of complete 
ignorance of the methods used in modern science for comparing correlated variates! Yet it 
goes out as the opinion of the President of a Society which is endeavouring to spread the 
scientific doctrines of Eugenics among the people! Major Darwin begins by stating that it is 
needful to have a common unit of measurement in order to compare two variates. To begin 
with we are not comparing two things, but we are comparing the influence of two things on 


* There would in our sense be no heredity if the average child born to noteworthy parents was equal 
to the average child of the whole community. Yet it is perfectly easy to understand how living beings 
could exist under such a law of reproduction. Major Darwin seems to be confusing two things, the fact 
that a man is born true to his species, and the fact that he resembles his immediate ancestry. It is 
the latter fact only which concerns us when we compare heredity and environment, i.e. how variation of 
immediate ancestry affects the individual’s physical or mental characters. But without such heredity 
individuals might quite well exist.

† The Eugenics Review, Vol. v. p. 152. The italics are mine.




a third, i.e. the intensity of a certain environmental influence and the intensity of a certain 
somatic character in the parent, say, on the intensity of the somatic character in the off- 
spring. Yet Major Darwin tells us we cannot do this because we cannot measure these 
things in the same unit !—How suavely yet forcibly Sir Francis Galton himself would have 
ridiculed such ignorance in high places as is passed by the Editor of the Eugenics Journal !— 
We can hear him now telling us how the intensity of each character could be measured by 
its grade, and how the problem turned on whether the same change in grade in the environ- 
ment and in the parental somatic character produced greater or less change in the grade of 
the filial somatic character. When we inquire whether inter-racially stature is more closely 
related to cephalic index or to eye colour, are we to be met by the statement that these 
characters cannot be compared because they cannot be measured in a ‘common unit,’ and 
then be told that it is not “wise to use words in scientific literature without endeavouring to
attach a definite meaning to them?” Every trained statistician knows that each character 
is measured in the unit of its own variability—in what he terms its standard deviation*, 
and that this standard deviation provides him with a measure of the frequency of each value 
of the variate in question. It seems to me that the only correct sentence in this paragraph, 
is the author’s statement that he himself has no idea what unit is ‘common’ to heredity and 
environment. 


But our author continues : 


“Take any quality, and we find that the human beings composing any community differ 
more or less considerably as regards that quality. Now we can measure the correlation 
between the differences shown in this quality and the differences of environment to which 
the members of the community in question had previously been exposed†. This is one
correlation. Then we can also measure the correlation coefficient between, say, father and 
son, as regards the quality in question. Here is a second correlation; and if we are told 
that the relative influence of environment and heredity is measured by the ratio between 
these two correlation coefficients, we certainly do thus get a clear conception of what is 
meant‡.”


But has the writer really obtained a clear conception of what such coefficients of correla- 
tion mean, when in the next paragraph he continues : 


“Imagine an ideal republic, in some respects similar to that designed by Plato, where not
only were all the children removed from their parents, but where they were all treated exactly 
alike. In these circumstances none of the differences between the adults could have anything 
to do with the differences of environments, and all must be due to some differences in inherent 
factors. In fact the environment correlation coefficient would be nil, whilst the hereditary 
correlation coefficient might be high §.” 


Could any better evidence be adduced that the President of the Eugenics Education Society 
did not know what a coefficient of correlation meant at that date? The coefficient of correlation 
for the environment might be anything from —1 to +1; the only obvious fact would be that you 
could not find its value, except in the form 0/0, from an environment which precluded any 
measure of variation. How again Sir Francis would have smiled at the notion that the 
coefficient of correlation for a constant environment must be nil. Why should we follow such 


* Of course he may or does need other constants to help in the description of the frequency. 

† loc. cit. p. 153.

‡ This seems to contradict the writer’s previous assertion that two things are incomparable, if they
have not a ‘common unit’!

§ I wrote at once to Major Darwin pointing out the error of such a statement and he withdrew it in
the next number. But the harm done by an article of this kind cannot be reversed by correcting a 
single misstatement. 




advice as that given by the President of the Society to avoid as far as possible “such phrases as 
the relative influence of heredity and environment,” when on his own showing he does not in 
the least appreciate the methods by which this relative influence is measured ? 


Then Major Darwin continues : “ Surely what we want to know is how we can do most good— 
whether by attending to reforms intended to affect human surroundings, or to reforms intended 
to influence mankind through the agency of heredity. But does this ratio [that of the environ- 
mental and hereditary correlation] give us any sure indication of the relative amount of attention 
which should be paid to these two methods of procedure?” Our only reply can be that these 
correlations certainly do, and that as long as the President of the Eugenics Education Society 
fails to grasp their meaning, he is doing grave harm to the science of eugenics. 

We measure the change in the character of an individual which would be produced by a 
change of a like or an allied character in a parent, such change being one of which we have 
experience ; we measure the change which would be produced in the character of the individual 
by changes in the environment such as we have experience of, i.e. when we move the individual 
from a badly ventilated to a well ventilated house, from a back to back to a through house, from 
a low wage to a high wage, and so forth, and we find the resulting changes are of a wholly 
different order in these cases to what happens when we change the physical characters, the 
health or habits which define the parents. It is on the basis of this that we assert that the relative 
strength of heredity is far greater than the strength of environment. To this reasoning, apart from 
such arguments as the above or those to be immediately dealt with, reply is only made by talk as to 
the impossibility of an individual surviving if you deprived him of his normal environment! It 
would be just as reasonable to assert that everything must be due to heredity, because a race of 
supermen would breed supermen! What the scientific eugenist has endeavoured to measure are 
the influences of such range of differences in environment as occur in everyday experience and 
are therefore producible from the political, economic and social standpoints, not the absence of 
all environment at all. But while this is recognised by some of the popular eugenic writers, they 
have approached the problem from another standpoint which indicates equally how little they 
grasp modern statistical theory. We admit, they say, that the environmental correlations may 
be of the order .03 or .05 and the inheritance correlations of the order .50. But this is the
correlation of one character in environment. You ought to take ten or twenty, and then you
will have multiplied up environment to be more effective than heredity, for .03 × 20 = .60. In the
first place we may suggest that it would be just as reasonable, if the argument were a valid one,
to multiply up the favourable hereditary characters, to take weight, height, muscular activity, 
health, intelligence, caution, and many other desirable factors, and these not only in one parent 
but in brothers, sisters, aunts, uncles and grandparents and treat the cross-correlation of these 
with the character under discussion. But although every improvement in stock would reflect 
itself in improvement in offspring, correlations cannot be added together—any more than forces 
by simple arithmetical addition. You do not combine two hereditary correlations any more than 
two environmental correlations by mere addition. You must proceed by the combinatory process 
indicated at the commencement of this paper, which is one of course familiar to every trained 
statistician. 

Yet here is a statement which the Editor of the Eugenics Review admits to its pages without
contradiction * : 

The point that we wish to make is this. In the face of so much ignorance concerning, not only 
heredity itself, but also its complement, the influence of environment, how can any one be justified in 
making sweeping generalisations with reference to these subjects ? 

Such generalisations, however, are made. It is said that we have a definite proof that inheritance is 
of far greater strength than environment. This argument takes the following shape. The correlations 
between parent and offspring for a number of features have been calculated, and the mean is found to 


* Vol. v. p. 219, in an article by A. M. Carr-Saunders. 




be somewhere about .5. Correlations between individuals and various aspects of their environment have
also been worked out—as, for instance, mental ability and conditions of clothing, or between myopia and
the age of learning to read*—and the mean value is found to be about .03. It is then said that the
mean “nature value” is at least five to ten times as great as the mean “nurture value,” and upon this
is founded the generalisation that “nature” is of far greater importance than “nurture”†. It may be
questioned, however, whether such a comparison does not involve a serious mistake. For if we consider
the two mean values that are compared, we find that, whereas the “mean nature value” is the mean
value of a number of observations, all of which provide a full measure of the strength of heredity, the
“mean nurture value” is the mean value of a number of observations, each of which measures only the
strength of some one isolated aspect of environment. It would appear then that the full strength of
inheritance has been compared, not with the full strength of environment, but with the average of a
number of small isolated aspects of the latter. As a matter of fact it is quite beyond our power at
present to sum up the full effect of environment upon the individual and compare it with the full effect
of heredity. We are, therefore, justified in saying that we neither know in particular cases how far the
environment can produce any effect, nor can we make any definite statement as to the comparative
strength of “nature” and “nurture.”


Now this is the doctrine passed by the Editors of the Eugenics Review, the journal of a
society, which has assumed the mantle of Francis Galton‡, and it is passed, because the
editorial committee of that society does not grasp the meaning of multiple correlation! The 
passages in italics have been so printed to draw our readers’ attention to them. In the first 
place, of course, a single correlation coefficient does not provide a full measure of the strength 
of heredity. In the table cited the coefficients are those for one parent or for one brother or 
sister. Each relative—and those for independent stocks are either non-correlated or inter- 
correlated very slightly—provides such a coefficient, and further each character in such relatives 
may be correlated with the character under discussion in the subject in question. In the next
place the environment factors do not consist of “some one isolated aspect of environment.”
All these factors or aspects are closely interlinked, and this was a fact well-known to the
workers in the Galton Laboratory. The real interpretation of such a difference as .50 and .03
in the average values of single coefficients can only be appreciated by those who are conversant 
with the theory of multiple correlation, and it is quite clear that those who profess to guide the 
public in this very difficult problem—which is essentially a scientific problem—lack any adequate 
knowledge of the sole instrument by which any conclusion can be drawn. 

The writer appears to be wholly ignorant of the nature of multiple correlation in the first 
place, and in the second entirely to overlook the very high correlations which exist between 
environmental factors. Bad wages, bad habits, bad housing, uncleanliness, insanitary surroundings,
crowded rooms, danger of infection, etc., etc. are all closely associated together,
and while the order of correlation between environmental and physical characters is low, that
between individual environmental factors is in our experience very high. Thus the problem of
multiple correlation illustrates closely the theory developed in the first part of this note; we
have to deal with a low $\rho$ and a high $\epsilon$.


For example, if we take the environmental factors to have an average inter-correlation of .70,
then an infinity of such factors for a mean environmental and individual correlation of .03 would

* As the writer phrases this correlation, it is very liable to be misinterpreted. What the Galton 
Laboratory did was to show that myopia was very markedly inherited, and that the theory that it was 
largely due to school environment was incorrect, because children who began to read late, i.e. went late 
to school, were not less myopic than those who went early. 

† Karl Pearson, Nature and Nurture, Eugenics Laboratory, Lectures vi. p. 25.

‡ If there was one point on which Francis Galton felt strongly and wrote it was on this point of the
relatively great intensity of “nature” as compared with “nurture.” I do not stand alone in recognising
it as an essential part of his teaching: “I am inclined to agree with Francis Galton,” writes Charles
Darwin, “in believing that education and environment produce only a small effect on the mind of
anyone, and that most of our qualities are innate.”




only raise the correlation to .0359 against a single parental correlation of .5000; if the correlation
was .05 instead of .03, we should have the total possible environmental multiple correlation .0598
as against .5000. Even if we raise the average environmental correlation to .1 and the inter-environmental
factor correlation be reduced to .5, the multiple correlation of an infinity of factors
is only .1414 as against the single factor of heredity .5000. Even if we could pick out one
hundred environmental factors which had no inter-correlations (which experience shows is
wholly impossible) and each of these independent factors was correlated to the extent of .05
with the mental or physical characters of an individual they would only just reach the hereditary
influence of a single character in a single parent.
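How quickly the limit is approached may be seen by letting the number of environmental factors grow. The sketch below assumes the equal-correlation formula $R_n^2 = n\rho^2/(1+(n-1)\epsilon)$ of the first part of this note; the helper names are illustrative only.

```python
from math import sqrt


def multiple_r(n, rho, eps):
    """R_n for n factors correlated rho with the character and eps together."""
    return sqrt(n * rho ** 2 / (1 + (n - 1) * eps))


def limiting_r(rho, eps):
    """Value approached by R_n for an infinity of such factors."""
    return rho / sqrt(eps)


# Environmental correlation .03, inter-correlation .70.
for n in (1, 10, 100, 1000):
    print(f"{n:>5} factors: R = {multiple_r(n, 0.03, 0.70):.4f}")
print(f"    limit    : R = {limiting_r(0.03, 0.70):.4f}")

# The other cases quoted in the text.
print(f"{limiting_r(0.05, 0.70):.4f}")       # rho = .05, eps = .70
print(f"{limiting_r(0.10, 0.50):.4f}")       # rho = .10, eps = .50
print(f"{multiple_r(100, 0.05, 0.0):.4f}")   # 100 independent factors at .05
```

Ten factors already carry the correlation most of the way to its limit, so that further factors of the same kind add almost nothing.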


Now let us suppose an absolutely idle case, namely that the environmental factors had the
same correlation as a parent, i.e. .5, with the character of the individual, and only a correlation
of .6 with each other, then if we could use an indefinitely great number of such factors the
multiple correlation would only be $.5/\sqrt{.6} = .6455$, while the correlation with two parents, with
no assortative mating, would be .7071. Even with assortative mating, it suffices to take only
the four grandparents into account to show that heredity acts in excess of an environmental
scheme even so preposterous as is suggested above. If we take the parental correlations .50, the
grandparental .25, and those of assortative mating .15, we have for the determinant:


$$\Delta = \begin{vmatrix}
1 & .50 & .50 & .25 & .25 & .25 & .25 \\
.50 & 1 & .15 & .50 & .50 & 0 & 0 \\
.50 & .15 & 1 & 0 & 0 & .50 & .50 \\
.25 & .50 & 0 & 1 & .15 & 0 & 0 \\
.25 & .50 & 0 & .15 & 1 & 0 & 0 \\
.25 & 0 & .50 & 0 & 0 & 1 & .15 \\
.25 & 0 & .50 & 0 & 0 & .15 & 1
\end{vmatrix}$$

Add together the second and third rows multiplied by .3951, and the fourth, fifth, sixth and
seventh multiplied by .0456 and subtract the result from the first. The first row then becomes
| .5593, 0, 0, 0, 0, 0, 0 |


the others of course remaining the same.


Hence $\Delta = .5593 \times \Delta_{00}$,
and $R^2 = 1 - \Delta/\Delta_{00} = 1 - .5593 = .4407$.
Therefore $R = .6639$.
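The row multipliers and the resulting $R$ can be recovered by solving the linear system that the row operation implies. The sketch below is one way of doing so, not the method of the paper: it assumes the standard identity $R^2 = 1 - \Delta/\Delta_{00} = \sum_i b_i r_{0i}$, where the weights $b_i$ satisfy the predictor equations, and uses a small hand-written Gauss-Jordan routine (`solve`, `multiple_correlation` and `CORR` are names chosen here).

```python
from math import sqrt


def solve(matrix, rhs):
    """Solve matrix * b = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * p for a, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def multiple_correlation(corr):
    """Return (weights, R) for predicting variate 0 from all the others."""
    c = corr[0][1:]                     # correlations with the predicted variate
    s = [row[1:] for row in corr[1:]]   # correlations among the predictors
    weights = solve(s, c)
    return weights, sqrt(sum(w * r for w, r in zip(weights, c)))


# Order: individual, two parents, the first parent's two parents,
# the second parent's two parents.
CORR = [
    [1.00, 0.50, 0.50, 0.25, 0.25, 0.25, 0.25],
    [0.50, 1.00, 0.15, 0.50, 0.50, 0.00, 0.00],
    [0.50, 0.15, 1.00, 0.00, 0.00, 0.50, 0.50],
    [0.25, 0.50, 0.00, 1.00, 0.15, 0.00, 0.00],
    [0.25, 0.50, 0.00, 0.15, 1.00, 0.00, 0.00],
    [0.25, 0.00, 0.50, 0.00, 0.00, 1.00, 0.15],
    [0.25, 0.00, 0.50, 0.00, 0.00, 0.15, 1.00],
]

weights, r = multiple_correlation(CORR)
print([round(w, 4) for w in weights])
print(round(r ** 2, 4), round(r, 4))
```

The weights are the multipliers applied to the parental and grandparental rows, and one minus the squared correlation is the first constituent left in the first row.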


Or together grandparents and parents would influence a man’s character more than an 
infinity of environmental factors of the same grade of correlation, because the latter factors 
are far more highly correlated together than several of our relatives. 


Actually of course we are dealing with average values; the average value of environmental
correlation with individual character being in our experience of the order .03 to .05 and the
inter-environmental factor correlations of the order .5 to .7. But these averages enable us to
appreciate the total effect. 


The doctrine taught by the writers in the Eugenics Review, that we know nothing of the 
relative intensity of environment and heredity and that it is unwise “to use words in scientific 
literature without endeavouring to attach a definite meaning to them” only demonstrates how far
the Editors of that Journal are removed from any appreciation themselves of modern statistical 
methods. How far the doctrine is removed from the very strong views held on this point by 
Francis Galton, only those who have studied his writings and know how strongly he felt person- 
ally on the subject are in the least competent to appreciate. 




VI. Formulae for the Determination of the Capacity of the Negro
Skull from External Measurements. 


By L. ISSERLIS, B.A. 


§ 1. Formulae for the determination of the capacity of the human skull from external 
measurements, were obtained by Lee and Pearson*. The material they employed consisted 
of various series of measurements of Bavarian, Aino and Naqada skulls. Measurements of 
Ancient and modern Egyptian and other non-European skulls were employed, chiefly for 
purposes of comparison. The formulae, some of which will be quoted later, were intended 
primarily for the prediction of the capacity of European skulls, from external measurements. 
Doubt has been thrown on several occasions on the applicability of these formulae to the Negro 
skull, one of the reasons alleged being the supposed difference in thickness of the bone of 
European and Negro crania. 

The publication† of the late Dr R. Crewdson Benington’s researches on the negro skull has
made it possible to obtain similar formulae for negro skulls, and to test how far these can 
be applied to the prediction of the capacity of European skulls and conversely to test the 
applicability of Lee and Pearson’s Equations to the negro skull. 


§ 2. The material is fully described in Dr Benington’s Study. The crania dealt with in 
the present paper are Benington’s series A, B, C. 

A. Congo Crania in the Royal College of Surgeons. These crania provide 46 males and 
21 females, as owing to various defects no capacity is available for numbers 25, 38, 48, 54
among the males and numbers 69, 72, 75, 79, 82, 85 among the females. 

B. Crania from the Gaboon, Group I, brought by Du Chaillu from Fernand Vaz in 1864. 
Of the 50 male and 44 female crania in the series, 2 males (numbers 3 and ?) and 1 female 
(number 2) are defective, leaving 48 male and 43 female crania available. 

C. Crania from the Gaboon, Group II, brought by Du Chaillu from Fernand Vaz in 1880.
Two of the 18 males (numbers 12a and 20) and two of the 19 females (numbers 8 and 18)
are defective. 

Altogether 110 male and 81 female crania have been dealt with. The correlation has been 
calculated of the capacity (C) and the product of the breadth, length and total height (B, L
and H), for each group and for the aggregates of 110 male, and of 81 female crania.

Correlation coefficients have also been calculated for the capacity and breadth, capacity and 
length, and capacity and total height, but for the aggregates of the three groups only. Regression
formulae are given in all cases. It is to be observed that Dr Crewdson Benington’s
measurements of capacity were taken with mustard seed, packing and measuring glass and
that the error of measurement or rather his average difference as compared with other workers
in the Biometric Laboratory was under 10 cm³.

In comparing the regression formulae obtained here, with those given by Lee and Pearson for 
European and other skulls it must be remembered that in all their formulae except (12) and (13) 
of p. 247 they employed the auricular height and not the total height. In the present paper as 
in Dr Benington’s study H denotes the total height. Lee and Pearson denote this by H’ and 
use H for the auricular height. 

It was not possible here to use the auricular height as it was not available for the whole of 
the Gaboon series B and C. 

* Phil. Trans. Vol. 196, Series A, pp. 225-264.
† Biometrika, Vol. viii. Nos. 3 and 4, Dec. 1911.




Taking first the male skulls, the mean value of the capacity and the product BLH, their 
standard deviations and the correlations are given in the following table. 


TABLE I.

                       Mean capacity   Mean BLH    σ_C        σ_BLH      r
                       in cm³          in cm³      in cm³     in cm³
46 Congo skulls        1344            3303        126.22     282.99     .872
48 Gaboon (1864)       1379            3295        108.30     230.30     .822
16 Gaboon (1880)       1447            3463        109.60     266.42     .808

110 Negro skulls       1375            3323        120.74     265.20     .842

The corresponding regression lines are
for the 46 Congo skulls      $C = .0003889\,BLH + 59 \pm \ldots/\sqrt{n}$    (1),
        48 Gaboon (1864)     $C = .0003865\,BLH + 105 \pm \ldots/\sqrt{n}$   (2),
        16 Gaboon (1880)     $C = .0003323\,BLH + 297 \pm \ldots/\sqrt{n}$   (3),
        110 male negro skulls $C = .0003849\,BLH + 96 \pm \ldots/\sqrt{n}$   (4).


Lee and Pearson’s corresponding equation for males is

$C = .000266\,BLH′ + 524.6$    (P)*.

This is not a regression line, but is obtained by the method of least squares from the results for
various races in their table 20.


The formulae (1)-(4) can be used to predict the capacity of an individual skull from external
measurements. The probable errors of the mean were calculated by the formula $0.67449\,\sigma_C\sqrt{1-r^2}/\sqrt{n}$,
where n is the number of skulls in the group to which the formula is applied. If we substitute
in (1)-(4) the mean values of B, L, H for the Bavarian male skulls used by Lee and Pearson,
viz.:


B = 150.5,
L = 180.6,
H = 133.8,
we obtain, from (1), $C = 1474 \pm \ldots/\sqrt{n}$,
             (2), $C = 1471 \pm \ldots/\sqrt{n}$,
             (3), $C = 1506 \pm \ldots/\sqrt{n}$,
             (4), $C = 1496 \pm \ldots/\sqrt{n}$.


* Loc. cit. Equation (12). H′ = total height.




The measured capacities of these German skulls have a mean value of 1503 c.c., a result
which is in very close agreement with (4), the formula based on 110 skulls. 1503 is the mean
capacity of 100 skulls so that $65/\sqrt{n} = 6.5$. Thus the difference between the actual mean capacity
of German skulls and the mean capacity estimated by the negro formula is less than 10 cm³,
although the mean capacity of German male skulls exceeds that of negro males by
1503 − 1375 = 128 cm³.
If the above values of B, L, H are substituted in Lee and Pearson’s formula (P) on p. 4
we obtain C = 1492.
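The arithmetic behind a regression line and its use for prediction can be sketched as follows. It is assumed here that the lines were found in the usual way from the tabulated constants (slope $= r\,\sigma_C/\sigma_{BLH}$, line through the two means), that BLH is tabulated in cm³ while B, L, H are entered in mm, and that H for the Bavarian males is 133.8 as used in the later section on thickness; the function names are illustrative.

```python
def regression_from_summary(mean_c, mean_p, sd_c, sd_p, r):
    """Slope and intercept of the regression of capacity C on the product P = BLH."""
    slope = r * sd_c / sd_p
    return slope, mean_c - slope * mean_p


def predict_capacity(b, l, h, slope_mm3=0.0003849, intercept=96):
    """Formula (4): capacity in cm^3 from breadth, length and total height in mm."""
    return slope_mm3 * b * l * h + intercept


# Table I, 46 Congo skulls (BLH tabulated in cm^3, i.e. mm^3 / 1000).
slope, intercept = regression_from_summary(1344, 3303, 126.22, 282.99, 0.872)
slope_per_mm3 = slope / 1000
print(round(slope_per_mm3, 7), round(intercept))

# Formula (4) applied to the Bavarian male means.
print(round(predict_capacity(150.5, 180.6, 133.8)))
```

The first line reproduces the constants of formula (1) and the second the value 1496 obtained from formula (4).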


On the other hand if we substitute the mean values of the dimensions of the 110 male negro 
skulls, B=137, L=178, H=135 in formula P we obtain C=1400 as compared with the measured 
mean of 1375. 


This is not as good a reconstruction as our formula (4) or as the formulae of Lee and Pearson 
employing auricular height, and is probably due to the fact that P is obtained by the method of 
least squares from 11 means only. 


§ 4. An approximation to the influence of the thickness of the bone of the skull on predictions
of capacity from external measurements can be obtained by differentiating the equation

$C = kBLH + \text{const.}$

and putting $dB = dL = dH = t$. We obtain

$dC = k(BL + LH + HB)\,t,$

or, if we observe that in the equations the constant is comparatively small,

$\dfrac{dC}{C} = \left(\dfrac{1}{B} + \dfrac{1}{L} + \dfrac{1}{H}\right) t,$

and with B = 150.5, L = 180.6, H = 133.8,

$dC = 1500\,t\,(.02)$ approximately.

Thus a difference of 10 cm³ in capacity corresponds to a difference of ⅓ mm. in thickness,
which is about 5% of the thickness (say 6 mm.) of the human skull.
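
The arithmetic of this approximation can be verified with a short sketch. It assumes, as the text does, that the constant in $C = kBLH + \text{const.}$ is neglected, so that $dC/C = t\,(1/B + 1/L + 1/H)$; the round value C = 1500 and the 6 mm skull thickness are taken from the passage above, and the function name is hypothetical.

```python
def thickness_change(dC, C, B, L, H):
    """Change in thickness t (mm) matching a capacity change dC (cm^3),
    using dC / C = t * (1/B + 1/L + 1/H) with B, L, H in mm."""
    return dC / (C * (1.0 / B + 1.0 / L + 1.0 / H))


relative_factor = 1.0 / 150.5 + 1.0 / 180.6 + 1.0 / 133.8
t = thickness_change(dC=10.0, C=1500.0, B=150.5, L=180.6, H=133.8)

print(round(relative_factor, 3))   # 0.02
print(round(t, 2))                 # 0.34 mm, about one third of a millimetre
print(round(100.0 * t / 6.0, 1))   # 5.7 per cent of a 6 mm skull wall
```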


We may fairly conclude then, that there is no appreciable difference in the thickness of the 
negro skull as compared with the European. 


§ 4. The female crania yield very similar results. The following is the table for the female 
skulls. 


TABLE II.

| Series             | Mean capacity (cm³) | Mean BLH (cm³) | S.D. of capacity (cm³) | S.D. of BLH (cm³) | Correlation r |
| 21 Congo skulls    | 1206                | 2858           | 107.7                  | 268               | .9077         |
| 43 Gaboon (1864)   | 1232                | 2924           | 126.7                  | 270.95            | .8814         |
| 17 Gaboon (1880)   | 1240                | 2964           | 97.31                  | 265.8             | .8560         |
| 81 Negro skulls    | 1227                | 2916           | 117                    | 255.72            | .7668         |




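The corresponding regression lines are

for the 21 Congo skulls: $C = 0.0003645\,BLH + 164 \pm [\ldots]/\sqrt{n}$ (5),
for the 43 Gaboon (1864) skulls: $C = 0.0004122\,BLH + 27 \pm 60/\sqrt{n}$ (6),
for the 17 Gaboon (1880) skulls: $C = 0.0003134\,BLH + 311 \pm 50/\sqrt{n}$ (7),
for the 81 negro skulls: $C = 0.0003508\,BLH + 204 \pm 75/\sqrt{n}$ (8).
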
The corresponding Lee and Pearson formula obtained by the method of least squares is 

$C = 0.000156\,BLH' + 812.0$ (Q).

The mean values of B, L, H′ for the Bavarian female skulls discussed by Lee and Pearson are
B = 144.11,
L = 173.59,
H′ = 128.07.


With these values, we deduce from (5) to (8) the following values for C:

(5) $C = 1331 \pm [\ldots]/\sqrt{n}$,
(6) $C = 1347 \pm 60/\sqrt{n}$,
(7) $C = 1315 \pm 50/\sqrt{n}$,
(8) $C = 1327 \pm 75/\sqrt{n}$.

The mean of the measured values of the capacities of these skulls is 1337 and formula (8)
based on 81 negro skulls gives a result in very close agreement.


If the above values of B, L, H’ are substituted in Lee and Pearson’s formula Q we obtain 
C=1284 a result which differs from the true value much more seriously than the prediction by 
the negro regression formula. 


Again, if we insert the mean values 


B = 130.75,
L = 171.33,
H = 129.81,


of the 81 female negro crania in the formula Q we get C=1266 as against the mean of the 
measured values which is C=1227, demonstrating again the fact that the formulae P, Q based 
on 11 means are not as good as the regression formulae. 


§ 5. We add tables of the correlation between capacity and breadth, capacity and length, 
and capacity and total height for the 110 male and the 81 female skulls, and for comparison 
reprint the corresponding value for German (Bavarian) skulls. 




TABLE III.

Correlation. Males.

|                      | Negro                | German                   |
| Capacity and Breadth | .4977                | .6720                    |
| Capacity and Height  | .6080 (total height) | .2431 (auricular height) |
| Capacity and Length  | .7433                | .5152                    |


TABLE IV.

Correlation. Females.

|                      | Negro                | German                   |
| Capacity and Breadth | .7578                | .7068                    |
| Capacity and Height  | .5450 (total height) | .4512 (auricular height) |
| Capacity and Length  | .6699                | .6873                    |


The corresponding regression lines are given in the tables below:


TABLE V.

Males.

| Negro                                                                      | German                                        |
| (9) $C = 12.6356\,B - [\ldots] \pm [\ldots]/\sqrt{n}$                      | $C = 13.432\,B - 517.34$                      |
| (10) $C = 12.8301\,L - 1087 \pm [\ldots]/\sqrt{n}$                         | $C = 9.892\,L - 282.55$                       |
| (11) $C = 15.3265\,H' - 694 \pm [\ldots]/\sqrt{n}$ (H′ = total height)     | $C = 5.264\,H + 868.05$ (auricular height)    |


TABLE VI.

Females.

| Negro                                                                      | German                                        |
| (12) $C = 17.872\,B - [\ldots] \pm [\ldots]/\sqrt{n}$                      | $C = 15.716\,B - 927.66$                      |
| (13) $C = 12.46\,L - [\ldots] \pm [\ldots]/\sqrt{n}$                       | $C = 12.055\,L - 755.53$                      |
| (14) $C = 10.871\,H' - 184 \pm [\ldots]/\sqrt{n}$ (H′ = total height)      | $C = 10.993\,H + 82.13$ (auricular height)    |


[Plate X. Dr Maynard’s Piebald Negro. Biometrika, Vol. X, Part I.]



No great degree of accuracy can be expected in reconstructing the capacity of a skull from a
single measurement, but the remarkable difference of formula (11) for negro skulls from the
corresponding German formula is of course due to their referring to different measurements
of the height. If we insert H = 133.8 in (11), which is the mean total height of the Bavarian
skulls, we get $C = 1356.7 \pm [\ldots]/\sqrt{n}$ instead of the measured mean C = 1503.

Similarly equation (9) gives $C = 1555.6 \pm [\ldots]/\sqrt{n}$ instead of 1503 when we insert the German
mean B = 150.5.


Thus (9) to (14) are of little use for our purpose.


VII. Note on a Negro Piebald. (C. D. MAYNARD.)


THE remarkably interesting photograph of a negro piebald on Plate X has been forwarded 
to the Editor by Dr C. D. Maynard. The native comes from the district round Chai Chai. 
Dr Maynard writes from Ressano Garcia, and states that the hospital attendant took the 
photograph. The extraordinary interest of the case arises from the fact that the thighs and 
feet are of normal negro pigmentation, but in the other patches we have varying degrees of 
pigmentation of the skin down to albinotic white. Unfortunately there is no dorsal view, but 
the back is stated to be also affected with albinotic areas. The boy reported that he was 
in the same condition when born, and that the nature and areas of the pigmentation had not 
altered. 


VIII. Note on Infantile Mortality and Employment of Women, from 
the Report on Condition of Woman and Child Wage-earners in the United 
States, Volume XIII. Infant Mortality and its Relation to the Employment 
of Mothers. 

By ETHEL M. ELDERTON. 


THE author of this Report emphasizes the difficulty of determining the effect of women’s 
employment and points out that 


“It would be possible to draw positive conclusions as to the relative importance of this particular
factor only by point-to-point comparison of the infant mortality for a period of years in two large 
communities, or two classes of large communities, in which all the material conditions were sub- 
stantially common, with the single important exception that in one a considerable proportion of the 
married female population of child-bearing age were at work outside of their homes and in the other 
community with which the comparison was made none of the women were so employed. 


To admit of entirely sound conclusions, it would be necessary that the populations—and especially 
the women—of both communities should be of like ages, races, and physical health, that their living 
conditions should be practically identical, and that, in a general way, the child-bearing women should 
be of about the same grade of intelligence....... In default of some such comparison on a broad scale of 
the mortality of the infants of working and non-working women of similar ages, races, intelligence, and 
living conditions, no one can determine accurately how many of the deaths of working women’s infants 
are due to the mother’s work and how many to the other conditions of their lives and environment.”
(p. 18). 


The author illustrates the point by taking the six New England States and giving the infant 
deathrate, percentage of women of 16 years and over who are breadwinners, percentage of foreign- 
born to the population and percentage of population living in towns of 4000 and more inhabitants, 
and showing that, though the states with the highest infant mortality have also the largest 




number of women employed, they have also the largest percentage of foreign-born and of those 
living in urban surroundings, and that it is therefore impossible without further investigation to 
assign the infant deathrate to any of these three factors. 


A further investigation has been undertaken into the 32 Massachusetts cities and the death- 
rate under a year is given, the percentage of foreign-born, the births per 1000 of the population*, 
the percentage of women gainfully employed and the percentage illiterate, and a comparison is made 
between the ten cities with the highest and the ten cities with the lowest infant deathrate and 
percentage of women employed and the other factors enumerated. The conclusion is reached that 
“These comparisons indicate, superficially at least, that a more direct relation exists between 
infant mortality and the birthrate, the percentage of foreign-born, and the percentage of female 
illiteracy than between infant mortality and the employment of women.” (p. 38). 


There can be no doubt that a direct study of the infant mortality in relation to women’s 
employment can only properly be made, when we confine our attention to women, employed and 
unemployed, who are actually mothers and live in the same town, and when we correct for age†,
and if possible home conditions. Still if we take a series of different towns the right method 
must be to correct by the method of partial correlation for such divergent factors as we are 
able to ascertain and allow for in the series of towns investigated. I have endeavoured to apply 
modern statistical methods to the data of this Report, taking as measures of the environmental 
conditions in the towns: D = the general deathrate, i = percentage of illiteracy, f = percentage of
foreign-born population, e = percentage of females employed 10 years of age and upwards (note,
not percentage of employed mothers, so we may be largely measuring effect of child labour on
future motherhood), and d = deaths under one year per 1000 births. Then we have for correlations:

$r_{de} = .68$, $r_{di} = .70$, $r_{df} = .74$.
Hence numbers of foreign-born and of illiterate appear to be slightly more influential on infantile 
mortality than employment of women. These values are certainly high and the first is the sort 
of crude value which is used as an argument against the employment of women. Proceeding to 
partial correlations we have 

fu="42, pra 
We next corrected for two factors and found : 

"34, 12, "43. 

Thus we see that illiteracy has least influence on the infantile deathrate and the presence of 
foreign-born most. 


But even the presence of foreign-born and of illiterates is not a very complete measure of 
environmental effects liable to influence the infantile mortality in different towns as apart from 
employment of women. Many women employed means industrial conditions and possibly 
generally bad environment. I have taken as a measure of this the general deathrate D and find 

$r_{Dd} = .41$, $r_{De} = .47$, $r_{Di} = .60$, $r_{Df} = .49$.
Whence I find : 
‘57, 62, Dras= “75, 
pra="61, pris ="68, 


showing very substantial relations after correction for a general measure of poor environment. 
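
The correction for a single factor used here is the ordinary first-order partial correlation. The sketch below applies it to infant mortality d and employment e with the general deathrate D held constant, using $r_{de} = .68$ and $r_{De} = .47$ as quoted. The correlation of D with d is taken as .41, which is an assumption wherever the printed figure is unclear: it is the value that reproduces the corrected coefficients .61 and .68 quoted in the text. The function name is illustrative.

```python
from math import sqrt


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))


r_de = 0.68   # infant mortality with employment of women (from the text)
r_De = 0.47   # general deathrate with employment of women (from the text)
r_Dd = 0.41   # general deathrate with infant mortality (assumed, see above)

# Infant mortality and employment, corrected for the general deathrate.
r_de_given_D = partial_corr(r_de, r_Dd, r_De)
print(round(r_de_given_D, 2))   # 0.61
```

With the same assumed .41, the corresponding correction applied to $r_{df} = .74$ and $r_{Df} = .49$ gives about .68.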


* The author is not very confident of the full accuracy of the complete registration of births. 
† Young women are often employed up to the birth of their first one or two children, but the deathrate
of these elder-born is heavier than the deathrate of those who immediately follow.


Next proceeding to allow for two factors we find 
{Dai = {Da = “44, de “35, 


the latter result shows that general deathrate and illiteracy are about equally influential on the 
relation of employment of women to infantile mortality. Finally I corrected for all three factors 
and found : 


${}_{Dif}r_{de} = .28$


or 60% of the crude correlation $r_{de} = .68$ is due to women being most employed in towns where
the general deathrate is high, where illiterates are frequent and the population is largely foreign- 
born. How much further the relationship would be reduced, could we equalise other features of 
these Massachusetts cities, it is not possible to predict. The examination of the individuals in 
one city appears to me to be the only satisfactory method of disentangling the numerous factors 
which influence infant mortality. We commend, however, the study of the first part of this 
Report, as it deals very clearly with the difficulties which arise, and will counteract the tendency, 
which is prevalent, to assert causation whenever association is observed. The author lays stress 
on avoiding such logical confusions. 


Part II of the Report deals with infant mortality and its relation to the employment of
mothers in Fall River, Massachusetts. In 1908 the attempt was made to visit the homes of 
each of the mothers of the 859 infants who died during the year and to ascertain details con- 
cerning her occupation, etc. In 279 cases the family could not be traced. In 266 cases prior to 
the birth of the child the mother was at work outside the home while in 314 cases the mother’s 
work was limited to household duties or other work carried on entirely at home. Thus only the 
cases of deaths are dealt with and the causes of death are compared in the two groups of cases 
(1) when the mother was at work outside the home prior to the birth of the child and (2) when 
the mother’s work was carried on entirely in the home. 


I hold that this method will never prove as satisfactory as that employed in districts in 
England ; in England certain districts are chosen and every baby within that area is visited 
and the deathrate per number born in one group can be compared with another and the 
circumstances surrounding those babies who survive and those who die in the first year of life 
in a given district can be analysed. 


I do not think that the fact that a rather higher percentage of all deaths from gastritis etc. in 
Fall River occur when the mother works away from home and a rather higher percentage from 
congenital debility at birth when the mother does not work away from home will help us much 
in discovering the influence of the employment of the mother on infant mortality, nor do I think 
it will throw much light on the question of stillbirths with which the Report also deals. It is 
found that there are no more stillbirths proportional to all deaths when the mother is industrially 
employed, but it seems to me that this tells us nothing about the number of stillbirths pro- 
portional to all births. The real question is whether mothers employed away from home in 
factory or workshop, whose other circumstances are the same, lose more children in the first year 
of life or have more children stillborn than the mothers who are only employed in their homes 
and I do not think a comparison of causes of death will lead us much further, and I think it may 
lead to difficulties. 


When dealing with the mother’s work after childbirth in relation to the causes of infant 
mortality it is pointed out that the smaller percentage of deaths from congenital disease among 
the children of mothers who returned to work after childbirth was owing to the fact that most 
of the children dying from this group of causes died in the early weeks of life before the mother 
returned to work. For this same reason the number of deaths from gastritis etc. of children 
whose mothers returned to work is exaggerated, for we are missing out a whole series of illnesses 




which have ceased to add to the child deathrate by the time the mother returns to work and we 


must increase in this way the percentage of deaths of any disease of the later months of a child’s 
first year of life. 


It seems to me that a comparison of deaths in this way will really give very little information ; 
an excess of deaths from one disease means a defect in some other disease ; it is shown that 
when the baby is nursed exclusively by the mother 26.0 per cent. of the deaths were from
diarrhoea, gastritis, etc.; when partly nursed the percentage was 52.3 and when artificial food
was exclusively employed the percentage of deaths from diarrhoea etc. was 42.9; the baby
certainly dies less from gastritis when it is breast fed but it dies in greater numbers from other 
causes. Here again there is a difficulty; deaths from congenital diseases fall on the first weeks 
of life when breast feeding is the rule, while deaths from gastritis etc. fall on the later months of 
child life when “partial breast feeding” has become more common and I do not think it is 
possible to draw any conclusions from a comparison of deaths from one disease to deaths from 
all diseases as to the importance of artificial feeding in relation to deaths from gastritis. 


Interesting information is given as to the reasons for artificial feeding ; the numbers are not 
large enough to justify any definite conclusions, but this is such an important part of any inquiry 
into the influence of artificial feeding on the infant deathrate that one welcomes its inclusion in a
report of this kind. 


WE have been requested by Professor F. M. Urban to insert the accompanying announcement. 


ANNOUNCEMENT. 


A prize of One Hundred Dollars ($100.00) is offered for the best paper on the Availability of 
Pearson’s Formulae for Psychophysics. 


The rules for the solution of this problem have been formulated in general terms by William
Brown. It is now required (1) to make their formulation specific, and (2) to show how they 
work out in actual practice. This means that the writer must show the steps to be taken, 
in the treatment of a complete set of data (Vollreihe), for the attainment in every case of a 
definite result. The calculations should be arranged with a view to practical application, i.e. so 
that the amount of computation is reduced to a minimum. If the labour of computation can be
reduced by new tables, this fact should be pointed out. 


The paper must contain samples of numerical calculation, but it is not necessary that the 
writer have experimental data of his own. In default of new data, those of F. M. Urban’s 
experiments on lifted weights (all seven observers) or those of H. Keller’s acoumetrical experi- 
ments (all results of one observer in both time-orders) are to be used. 


Papers in competition for this Prize will be received, not later than December 31st, 1914, by 
Professor E. B. Titchener, Cornell Heights, Ithaca, N.Y., U.S.A. Such papers are to be marked 
only with a motto, and are to be accompanied by a sealed envelope, marked with the same motto, 
and containing the name and address of the writer. The Prize will be awarded by a committee 
consisting of Professors William Brown, E. B. Titchener and F. M. Urban. 


The committee will make known the name of the successful competitor on July 1, 1915. 
The unsuccessful papers, with the corresponding envelopes, will be destroyed (unless called for 
by their authors) six months after the publication of the award. 


Corrigendum. Dr Derry has most kindly pointed out a slip on p. 307, Vol. VIII; the value
of 100 (B − H)/L for Congo female crania is +1.9 and not −1.9, which brings these crania nearer
to their proper place, and the remarks on this point p. 308 should accordingly be cancelled.

